6.1. Basic sort algorithms
(From Wikipedia, the free encyclopedia) 


In computer science and mathematics, a sorting algorithm is an algorithm
that puts elements of a list in a certain order. The most-used orders are
numerical order and lexicographical order. Efficient sorting is important to
optimizing the use of other algorithms (such as search and merge
algorithms) that require sorted lists to work correctly; it is also often useful
for canonicalizing data and for producing human-readable output. More
formally, the output must satisfy two conditions:


1. The output is in non-decreasing order (each element is no smaller than
the previous element according to the desired total order);
2. The output is a permutation, or reordering, of the input.


Since the dawn of computing, the sorting problem has attracted a great deal
of research, perhaps due to the complexity of solving it efficiently despite
its simple, familiar statement. For example, bubble sort was analyzed as
early as 1956. Although many consider it a solved problem, useful new
sorting algorithms are still being invented to this day (for example, library
sort was first published in 2004). Sorting algorithms are prevalent in
introductory computer science classes, where the abundance of algorithms
for the problem provides a gentle introduction to a variety of core algorithm
concepts, such as big O notation, divide-and-conquer algorithms, data
structures, randomized algorithms, best, worst and average case analysis,
time-space tradeoffs, and lower bounds.


Classification 


Sorting algorithms used in computer science are often classified by:


• Computational complexity (worst, average and best behaviour) of
element comparisons in terms of the size of the list (n). For typical
sorting algorithms good behavior is O(n log n) and bad behavior is
O(n^2). (See Big O notation.) Ideal behavior for a sort is O(n). Sort
algorithms which only use an abstract key comparison operation
always need at least Ω(n log n) comparisons on average.

• Computational complexity of swaps (for "in place" algorithms).

• Memory usage (and use of other computer resources). In particular,
some sorting algorithms are "in place", such that only O(1) or O(log n)
memory is needed beyond the items being sorted, while others need to
create auxiliary locations for data to be temporarily stored.

• Recursion. Some algorithms are either recursive or non-recursive,
while others may be both (e.g., merge sort).

• Stability: stable sorting algorithms maintain the relative order of
records with equal keys (i.e. values). See below for more information.

• Whether or not they are a comparison sort. A comparison sort
examines the data only by comparing two elements with a comparison
operator.

• General method: insertion, exchange, selection, merging, etc.
Exchange sorts include bubble sort and quicksort. Selection sorts
include shaker sort and heapsort.


Stability 


Stable sorting algorithms maintain the relative order of records with equal
keys (i.e. sort key values). That is, a sorting algorithm is stable if,
whenever there are two records R and S with the same key and with R
appearing before S in the original list, R will appear before S in the
sorted list.


When equal elements are indistinguishable, such as with integers, or more
generally, any data where the entire element is the key, stability is not an
issue. However, assume that the following pairs of numbers are to be sorted
by their first coordinate:


(4, 1) (3, 7) (3, 1) (5, 6) 


In this case, two different results are possible, one which maintains the
relative order of records with equal keys, and one which does not:


(3, 7) (3, 1) (4, 1) (5, 6) (order maintained) 
(3, 1) (3, 7) (4, 1) (5, 6) (order changed) 


Unstable sorting algorithms may change the relative order of records with
equal keys, but stable sorting algorithms never do so. Unstable sorting
algorithms can be specially implemented to be stable. One way of doing
this is to artificially extend the key comparison, so that comparisons
between two objects with otherwise equal keys are decided using the order
of the entries in the original data as a tie-breaker. Remembering this
order, however, often involves an additional space cost.
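The tie-breaker idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names selection_sort and stabilized are hypothetical, and the three-element input was chosen because a swap-based selection sort visibly reorders its equal keys.

```python
def selection_sort(items, key):
    """Plain swap-based selection sort; not stable in general."""
    items = list(items)
    for i in range(len(items) - 1):
        smallest = i
        for j in range(i + 1, len(items)):
            if key(items[j]) < key(items[smallest]):
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


def stabilized(sort, items, key):
    """Extend each key with the original position as a tie-breaker."""
    decorated = [(key(item), index, item) for index, item in enumerate(items)]
    ordered = sort(decorated, key=lambda entry: (entry[0], entry[1]))
    return [item for _, _, item in ordered]


first = lambda pair: pair[0]
pairs = [(3, 7), (3, 1), (1, 5)]

unstable_result = selection_sort(pairs, first)            # (3, 1) ends up before (3, 7)
stable_result = stabilized(selection_sort, pairs, first)  # original order of the 3s is kept
```

The decorated list is the additional space cost mentioned above: one stored index per record.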


Sorting based on a primary, secondary, tertiary, etc. sort key can be done by
any sorting method, taking all sort keys into account in comparisons (in
other words, using a single composite sort key). If a sorting method is
stable, it is also possible to sort multiple times, each time with one sort
key. In that case the sort keys must be applied in order of increasing
priority: the least significant key first and the primary key last.
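A short Python sketch of multi-pass sorting, relying on the fact that Python's built-in sorted is documented to be stable. The record data is invented for illustration. Sorting by the secondary key first and the primary key last reproduces the composite-key result, while the opposite order does not.

```python
# (name, age) records; name is the primary key, age the secondary key.
records = [("smith", 30), ("jones", 25), ("smith", 25), ("adams", 30)]

# One pass with a composite key.
composite = sorted(records, key=lambda r: (r[0], r[1]))

# Two stable passes: secondary key first, primary key last.
by_age = sorted(records, key=lambda r: r[1])
two_pass = sorted(by_age, key=lambda r: r[0])

# Applying the passes in the opposite order sorts by age first instead.
by_name = sorted(records, key=lambda r: r[0])
wrong_order = sorted(by_name, key=lambda r: r[1])
```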


6.1.1. Insertion sort
(From Wikipedia, the free encyclopedia) 


Insertion sort is a simple sorting algorithm, a comparison sort in which the
sorted array (or list) is built one entry at a time. It is much less efficient on
large lists than more advanced algorithms such as quicksort, heapsort, or
merge sort, but it has various advantages:


• Simple to implement
• Efficient on (quite) small data sets
• Efficient on data sets which are already substantially sorted: it runs in
O(n + d) time, where d is the number of inversions
• More efficient in practice than most other simple O(n^2) algorithms
such as selection sort or bubble sort: the average time is n^2/4 and it is
linear in the best case
• Stable (does not change the relative order of elements with equal keys)
• In-place (only requires a constant amount O(1) of extra memory space)
• It is an online algorithm, in that it can sort a list as it receives it.


Algorithm 


In abstract terms, every iteration of an insertion sort removes an element
from the input data, inserting it at the correct position in the already sorted
list, until no elements are left in the input. The choice of which element to
remove from the input is arbitrary and can be made using almost any choice
algorithm.


Sorting is typically done in-place. The resulting array after k iterations
contains the first k entries of the input array and is sorted. In each step, the
first remaining entry of the input is removed and inserted into the result at
the right position, thus extending the result:


    Sorted partial result          Unsorted data
    [ <= x ][ > x ]                [ x ][ ... ]

becomes:

    Sorted partial result          Unsorted data
    [ <= x ][ x ][ > x ]           [ ... ]

with each element > x copied to the right as it is compared against x.

The most common variant, which operates on arrays, can be described as:


1. Suppose we have a method called insert designed to insert a value into
a sorted sequence at the beginning of an array. It operates by starting at
the end of the sequence and shifting each element one place to the
right until a suitable position is found for the new element. It has the
side effect of overwriting the value stored immediately after the sorted
sequence in the array.

2. To perform insertion sort, start at the left end of the array and invoke
insert to insert each element encountered into its correct position. The
ordered sequence into which we insert it is stored at the beginning of
the array in the set of indexes already examined. Each insertion
overwrites a single value, but this is okay because it's the value we're
inserting.


A simple pseudocode version of the complete algorithm follows, where the
arrays are zero-based:


    insertionSort(array A)
        for i <- 1 to length[A]-1 do
            value <- A[i]
            j <- i-1
            while j >= 0 and A[j] > value do
                A[j+1] <- A[j]
                j <- j-1
            A[j+1] <- value


Good and bad input cases 


In the best case of an already sorted array, this implementation of insertion
sort takes O(n) time: in each iteration, the first remaining element of the
input is only compared with the last element of the sorted subsection of the
array. This same case provides worst-case behavior for non-randomized and
poorly implemented quicksort, which will take O(n^2) time to sort an
already-sorted list. Thus, if an array is sorted or nearly sorted, insertion sort
will significantly outperform quicksort.


The worst case is an array sorted in reverse order, as every execution of the
inner loop will have to scan and shift the entire sorted section of the array
before inserting the next element. Insertion sort takes O(n^2) time in this
worst case as well as in the average case, which makes it impractical for
sorting large numbers of elements. However, insertion sort's inner loop is
very fast, which often makes it one of the fastest algorithms for sorting
small numbers of elements, typically fewer than 10 or so.
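The gap between the two cases can be made concrete by counting comparisons. The following Python sketch mirrors the pseudocode above; the function name insertion_sort_comparisons and the counter are additions made purely for illustration.

```python
def insertion_sort_comparisons(data):
    """Sort a copy of data; return (sorted_list, number_of_key_comparisons)."""
    a = list(data)
    comparisons = 0
    for i in range(1, len(a)):
        value = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of a[j] against value
            if a[j] <= value:
                break                 # correct position found
            a[j + 1] = a[j]           # shift the larger element right
            j -= 1
        a[j + 1] = value
    return a, comparisons


n = 8
_, best = insertion_sort_comparisons(range(n))            # already sorted
_, worst = insertion_sort_comparisons(range(n, 0, -1))    # reverse sorted
```

For n = 8 the sorted input needs n - 1 = 7 comparisons, while the reversed input needs n(n - 1)/2 = 28.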


Comparisons to other sorts


Insertion sort is very similar to selection sort. Just like in selection sort,
after k passes through the array, the first k elements are in sorted order. For
selection sort, these are the k smallest elements, while in insertion sort they
are whatever the first k elements were in the unsorted array. Insertion sort's
advantage is that it only scans as many elements as it needs to in order to
place the (k + 1)st element, while selection sort must scan all remaining
elements to find the absolute smallest element.


Simple calculation shows that insertion sort will therefore usually perform
about half as many comparisons as selection sort. Assuming the (k + 1)st
element's rank is random, it will on average require shifting half of the
previous k elements over, while selection sort always requires scanning all
unplaced elements. If the array is not in a random order, however, insertion
sort can perform just as many comparisons as selection sort (for a reverse-
sorted list). It will also perform far fewer comparisons, as few as n - 1, if
the data is pre-sorted; thus insertion sort is much more efficient if the array
is already sorted or "close to sorted." It can be seen as an advantage for some
real-time applications that selection sort will perform identically regardless
of the order of the array, while insertion sort's running time can vary
considerably.


While insertion sort typically makes fewer comparisons than selection sort,
it requires more writes because the inner loop can require shifting large
sections of the sorted portion of the array. In general, insertion sort will
write to the array O(n^2) times while selection sort will write only O(n)
times. For this reason, selection sort may be better in cases where writes to
memory are significantly more expensive than reads, such as EEPROM or
Flash memory.
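A rough way to see the difference in writes is to instrument both algorithms. The sketch below is illustrative only: the function names are hypothetical, each array assignment is counted as one write, and a swap is counted as two.

```python
def insertion_sort_writes(data):
    """Count array writes made by a shifting insertion sort."""
    a = list(data)
    writes = 0
    for i in range(1, len(a)):
        value = a[i]
        j = i - 1
        while j >= 0 and a[j] > value:
            a[j + 1] = a[j]
            writes += 1
            j -= 1
        a[j + 1] = value
        writes += 1
    return writes


def selection_sort_writes(data):
    """Count array writes made by a swap-based selection sort."""
    a = list(data)
    writes = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
            writes += 2
    return writes


n = 100
reversed_input = list(range(n, 0, -1))
ins_writes = insertion_sort_writes(reversed_input)   # grows like n^2 / 2
sel_writes = selection_sort_writes(reversed_input)   # at most 2 * (n - 1)
```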


Some divide-and-conquer algorithms such as quicksort and mergesort sort
by recursively dividing the list into smaller sublists which are then sorted. A
useful optimization in practice for these algorithms is to switch to insertion
sort for "small enough" sublists on which insertion sort outperforms the
more complex algorithms. The size of list for which insertion sort has the
advantage varies by environment and implementation, but is typically
around 8 to 20 elements.


Variants 


D.L. Shell made substantial improvements to the algorithm, and the
modified version is called Shell sort. It compares elements separated by a
distance that decreases on each pass. Shell sort has distinctly improved
running times in practical work, with two simple variants requiring
O(n^(3/2)) and O(n^(4/3)) time.


If comparisons are very costly compared to swaps, as is the case for
example with string keys stored by reference or with human interaction
(such as choosing one of a pair displayed side-by-side), then using binary
insertion sort can be a good strategy. Binary insertion sort employs binary
search to find the right place to insert new elements, and therefore performs
⌈log2(n!)⌉ comparisons in the worst case, which is O(n log n). The algorithm
as a whole still takes O(n^2) time on average due to the series of swaps
required for each insertion, and since it always uses binary search, the best
case is no longer Ω(n) but Ω(n log n).
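A minimal Python sketch of binary insertion sort, assuming the standard library's bisect module for the binary search. The function name is hypothetical. Using bisect_right places a new element after any equal elements, which keeps the sort stable.

```python
from bisect import bisect_right


def binary_insertion_sort(data):
    """Insertion sort that locates each insertion point by binary search."""
    a = list(data)
    for i in range(1, len(a)):
        value = a[i]
        # Binary search within the sorted prefix a[0:i].
        pos = bisect_right(a, value, 0, i)
        # Shift a[pos:i] one place right; this is still O(i) element moves.
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = value
    return a


result = binary_insertion_sort([4, 3, 5, 1, 2, 0, 7, 9, 6])
```

Only the number of comparisons drops to O(n log n); the shifting step keeps the overall running time quadratic, as described above.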


To avoid having to make a series of swaps for each insertion, we could
instead store the input in a linked list, which allows us to insert and delete
elements in constant time. Unfortunately, binary search on a linked list is
impossible, so we still spend O(n^2) time searching. If we instead replace it
by a more sophisticated data structure such as a heap or binary tree, we can
significantly decrease both search and insert time. This is the essence of
heap sort and binary tree sort.


In 2004, Bender, Farach-Colton, and Mosteiro published a new variant of
insertion sort called library sort or gapped insertion sort that leaves a small
number of unused spaces ("gaps") spread throughout the array. The benefit
is that insertions need only shift elements over until a gap is reached.
They show that this sorting algorithm, surprising in its simplicity, runs in
O(n log n) time with high probability.


Examples 

C++ Example:

    #include <iostream>
    #include <cstdio>
    // Originally compiled and tested with g++ on Linux
    using namespace std;

    bool swap(int&, int&);     // Swaps two ints
    void desc(int* ar, int);   // Just shows the array visually
    int ins_sort(int*, int);   // The insertion sort function

    int main()
    {
        int array[9] = {4, 3, 5, 1, 2, 0, 7, 9, 6}; // The original array
        desc(array, 9);
        ins_sort(array, 9);
        cout << "Array Sorted Press Enter To Continue and See the Resultant Array"
             << endl;
        getchar();
        desc(array, 9);
        getchar();
        return 0;
    }

    int ins_sort(int* array, int len)
    {
        for (int i = 0; i < len; i++)
        {
            int val = array[i];
            int key = i;
            cout << "key(Key) = " << key << "\tval(Value) = " << val << endl;
            // Move the new value left while its left neighbour is larger.
            for (; key >= 1 && array[key-1] > val; --key)
            {
                cout << "Swapping Backward\tfrom (key) " << key << " of (Value) "
                     << array[key] << "\tto (key) " << key-1
                     << " of (Value) " << array[key-1];
                cout << "\n\t" << key << " <----> " << key-1 << "\t( "
                     << array[key] << " <----> " << array[key-1] << " )";
                swap(array[key], array[key-1]);
                desc(array, len);
            }
        }
        return *array;   // First (smallest) element of the sorted array
    }

    bool swap(int& pos1, int& pos2)
    {
        int tmp = pos1;
        pos1 = pos2;
        pos2 = tmp;
        return true;
    }

    void desc(int* ar, int len)
    {
        cout << endl << "Describing The Given Array" << endl;
        for (int i = 0; i < len; i++)
            cout << "-------\t";
        cout << endl;
        for (int i = 0; i < len; i++)
            cout << " | " << i << " |\t";       // Index row
        cout << endl;
        for (int i = 0; i < len; i++)
            cout << " ( " << ar[i] << " )\t";   // Value row
        cout << endl;
        for (int i = 0; i < len; i++)
            cout << "-------\t";
        getchar();                              // Wait for Enter
    }

Python Example:

    def insertion_sort(A):
        for i in range(1, len(A)):
            key = A[i]
            j = i - 1
            # Shift larger elements one place to the right.
            while j >= 0 and A[j] > key:
                A[j+1] = A[j]
                j -= 1
            A[j+1] = key


6.1.2. Selection sort 
(From Wikipedia, the free encyclopedia) 


Selection sort is a sorting algorithm, specifically an in-place comparison
sort. It has O(n^2) complexity, making it inefficient on large lists, and
generally performs worse than the similar insertion sort. Selection sort is
noted for its simplicity and also has performance advantages over more
complicated algorithms in certain situations. It works as follows:


1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the
second position)


Effectively, we divide the list into two parts: the sublist of items already
sorted, which we build up from left to right and is found at the beginning,
and the sublist of items remaining to be sorted, occupying the remainder of
the array.

Here is an example of this sort algorithm sorting five elements:

    31 25 12 22 11
    11 25 12 22 31
    11 12 25 22 31
    11 12 22 25 31


Selection sort can also be used on list structures that make add and remove
efficient, such as a linked list. In this case it's more common to remove the
minimum element from the remainder of the list, and then insert it at the
end of the values sorted so far. For example:


    31 25 12 22 11
    11 31 25 12 22
    11 12 31 25 22
    11 12 22 31 25
    11 12 22 25 31


Implementation 


The following is a C/C++ implementation, which makes use of a swap
function:


    void selectionSort(int a[], int size)
    {
        int i, j, min;

        for (i = 0; i < size - 1; i++)
        {
            min = i;                     /* index of the smallest value so far */
            for (j = i+1; j < size; j++)
            {
                if (a[j] < a[min])
                {
                    min = j;
                }
            }
            swap(a[i], a[min]);          /* swap function assumed defined elsewhere */
        }
    }

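Python example:

    def selection_sort(A):
        for i in range(0, len(A)-1):
            smallest = A[i]
            pos = i
            # Find the smallest value in A[i:] and remember its position.
            for j in range(i+1, len(A)):
                if A[j] < smallest:
                    smallest = A[j]
                    pos = j
            # Swap it into position i.
            A[pos] = A[i]
            A[i] = smallest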


Analysis 


Selection sort is not difficult to analyze compared to other sorting
algorithms since none of the loops depend on the data in the array. Selecting
the lowest element requires scanning all n elements (this takes n - 1
comparisons) and then swapping it into the first position. Finding the next
lowest element requires scanning the remaining n - 1 elements and so on,
for (n - 1) + (n - 2) + ... + 2 + 1 = n(n - 1) / 2 = O(n^2) comparisons (see
arithmetic progression). Each of these scans requires one swap for n - 1
elements (the final element is already in place). Thus, the comparisons
dominate the running time, which is O(n^2).
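The counts in this analysis can be checked with a small instrumented version. The Python sketch below uses a hypothetical function name and counters that are not part of the implementations above.

```python
def selection_sort_counts(data):
    """Return (sorted_list, comparisons, swaps) for a swap-based selection sort."""
    a = list(data)
    comparisons = swaps = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
        swaps += 1
    return a, comparisons, swaps


n = 5
result, comparisons, swaps = selection_sort_counts([31, 25, 12, 22, 11])
```

For n = 5 this performs 4 + 3 + 2 + 1 = 10 comparisons and n - 1 = 4 swaps, whatever the input order.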


Comparison to other sorting algorithms


Among simple average-case O(n^2) algorithms, selection sort always
outperforms bubble sort and gnome sort, but is generally outperformed by
insertion sort. Insertion sort is very similar in that after the kth iteration, the
first k elements in the array are in sorted order. Insertion sort's advantage is
that it only scans as many elements as it needs to in order to place the
(k + 1)st element, while selection sort must scan all remaining elements to
find the (k + 1)st element.


Simple calculation shows that insertion sort will therefore usually perform
about half as many comparisons as selection sort, although it can perform
just as many or far fewer depending on the order the array was in prior to
sorting. It can be seen as an advantage for some real-time applications that
selection sort will perform identically regardless of the order of the array,
while insertion sort's running time can vary considerably. However, this is
more often an advantage for insertion sort in that it runs much more
efficiently if the array is already sorted or "close to sorted."


Another key difference is that selection sort always performs O(n) swaps,
while insertion sort performs O(n^2) swaps in the average and worst cases.
Because swaps require writing to the array, selection sort is preferable if
writing to memory is significantly more expensive than reading, such as
when dealing with an array stored in EEPROM or Flash.


Finally, selection sort is greatly outperformed on larger arrays by O(n log n)
divide-and-conquer algorithms such as quicksort and mergesort. However,
insertion sort and selection sort are both typically faster for small arrays
(i.e. fewer than 10-20 elements). A useful optimization in practice for the
recursive algorithms is to switch to insertion sort or selection sort for "small
enough" sublists.


Variants 


Heapsort greatly improves the basic algorithm by using an implicit heap data
structure to speed up finding and removing the lowest datum. If
implemented correctly, the heap will allow finding the next lowest element
in O(log n) time instead of O(n) for the inner loop in normal selection sort,
reducing the total running time to O(n log n).


A bidirectional variant of selection sort, called cocktail sort, is an algorithm
which finds both the minimum and maximum values in the list in every
pass. This reduces the number of scans of the list by a factor of 2,
eliminating some loop overhead but not actually decreasing the number of
comparisons or swaps. Note, however, that cocktail sort more often refers to
a bidirectional variant of bubble sort.


Selection sort can be implemented as a stable sort. If, rather than swapping
in step 2, the minimum value is inserted into the first position (that is, all
intervening items moved down), the algorithm is stable. However, this
modification leads to O(n^2) writes, eliminating the main advantage of
selection sort over insertion sort, which is always stable.
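The stable variant can be sketched in Python as follows. The function name and the key parameter are illustrative assumptions; list.pop followed by list.insert performs the "move all intervening items down" step.

```python
def stable_selection_sort(items, key=lambda x: x):
    """Selection sort that inserts the minimum instead of swapping it."""
    a = list(items)
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            # Strict '<' keeps the first of several equal keys as the minimum.
            if key(a[j]) < key(a[smallest]):
                smallest = j
        # Remove the minimum and re-insert it at position i; the items in
        # between each move down one place, preserving their relative order.
        a.insert(i, a.pop(smallest))
    return a


pairs = [(3, 7), (3, 1), (1, 5)]
result = stable_selection_sort(pairs, key=lambda pair: pair[0])
```

Each pop and insert moves O(n) elements, which is where the O(n^2) writes come from.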


6.1.3. Bubble sort 
(From Wikipedia, the free encyclopedia) 


Bubble sort is a simple sorting algorithm. It works by repeatedly stepping
through the list to be sorted, comparing two items at a time and swapping
them if they are in the wrong order. The pass through the list is repeated
until no swaps are needed, which means the list is sorted. The algorithm
gets its name from the way smaller elements "bubble" to the top (i.e. the
beginning) of the list via the swaps. (Another opinion: it gets its name from
the way greater elements "bubble" to the end.) Because it only uses
comparisons to operate on elements, it is a comparison sort. This is the
easiest comparison sort to implement.

A simple way to express bubble sort in pseudocode is as follows:

    procedure bubbleSort( A : list of sortable items ) defined as:
      do
        swapped := false
        for each i in 0 to length( A ) - 2 do:
          if A[ i ] > A[ i + 1 ] then
            swap( A[ i ], A[ i + 1 ] )
            swapped := true
          end if
        end for
      while swapped
    end procedure

The algorithm can also be expressed as follows (here the list is indexed
from 1):

    procedure bubbleSort( A : list of sortable items ) defined as:
      for each i in 1 to length( A ) do:
        for each j in length( A ) downto i + 1 do:
          if A[ j ] < A[ j - 1 ] then
            swap( A[ j ], A[ j - 1 ] )
          end if
        end for
      end for
    end procedure


The difference between this and the first pseudocode implementation is
discussed later in the article.


Analysis 


Best-case performance 


Bubble sort has best-case complexity Ω(n). When a list is already sorted,
bubble sort will pass through the list once, and find that it does not need to
swap any elements. Thus bubble sort will make only n - 1 comparisons and
determine that the list is completely sorted. It will also use considerably
less time than O(n^2) if the elements in the unsorted list are not too far from
their sorted places.


Rabbits and turtles 


The positions of the elements in bubble sort will play a large part in
determining its performance. Large elements at the top of the list do not
pose a problem, as they are quickly swapped downwards. Small elements at
the bottom, however, as mentioned earlier, move to the top extremely
slowly. This has led to these types of elements being named rabbits and
turtles, respectively.
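The effect is easy to observe by counting passes. The Python sketch below follows the first pseudocode version (repeat until no swap occurs); the function name and the pass counter are illustrative additions.

```python
def bubble_sort_passes(data):
    """Return (sorted_list, number_of_passes) for the swapped-flag bubble sort."""
    a = list(data)
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a, passes


_, sorted_passes = bubble_sort_passes([1, 2, 3, 4, 5, 6])   # already sorted
_, rabbit_passes = bubble_sort_passes([6, 1, 2, 3, 4, 5])   # large element at the top
_, turtle_passes = bubble_sort_passes([2, 3, 4, 5, 6, 1])   # small element at the bottom
```

The rabbit reaches its place in a single pass (plus one pass to confirm), while the turtle moves only one position per pass and so needs as many passes as there are elements.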


Various efforts have been made to eliminate turtles to improve upon the
speed of bubble sort. Cocktail sort does pretty well, but it still retains O(n^2)
worst-case complexity. Comb sort compares elements large gaps apart and
can move turtles extremely quickly, before proceeding to smaller and
smaller gaps to smooth out the list. Its average speed is comparable to faster
algorithms like quicksort.
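A minimal comb sort sketch in Python, to show how large gaps move turtles quickly. The shrink factor of 1.3 is a commonly used value and is an assumption here, not something stated in the text above; the function name is likewise hypothetical.

```python
def comb_sort(data, shrink=1.3):
    """Bubble-sort-like passes over elements `gap` apart, with a shrinking gap."""
    a = list(data)
    gap = len(a)
    swapped = True
    # Keep going until a full pass with gap 1 makes no swap.
    while gap > 1 or swapped:
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a


result = comb_sort([2, 3, 4, 5, 6, 1])
```

Once the gap has shrunk to 1 the passes are ordinary bubble sort passes, which guarantees the final list is sorted.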


Alternative implementations 


One way to optimize bubble sort is to note that, after each pass, the largest
element will always move down to the bottom. During each comparison, it
is clear that the largest element will move downwards. Given a list of size n,
the nth element will be guaranteed to be in its proper place. Thus it suffices
to sort the remaining n - 1 elements. Again, after this pass, the (n - 1)th
element will be in its final place.


In pseudocode, this will cause the following change:

    procedure bubbleSort( A : list of sortable items ) defined as:
      n := length( A )
      do
        swapped := false
        n := n - 1
        for each i in 0 to n - 1 do:
          if A[ i ] > A[ i + 1 ] then
            swap( A[ i ], A[ i + 1 ] )
            swapped := true
          end if
        end for
      while swapped
    end procedure


We can then do bubbling passes over increasingly smaller parts of the list.
More precisely, instead of doing n^2 comparisons (and swaps), we can use
only n + (n - 1) + (n - 2) + ... + 1 comparisons. This sums up to
n(n + 1) / 2, which is still O(n^2), but which can be considerably faster in
practice.


In practice 


Although bubble sort is one of the simplest sorting algorithms to understand
and implement, its O(n^2) complexity means it is far too inefficient for use
on lists having more than a few elements. Even among simple O(n^2) sorting
algorithms, algorithms like insertion sort are usually considerably more
efficient.


Due to its simplicity, bubble sort is often used to introduce the concept of an
algorithm, or a sorting algorithm, to introductory computer science
students. However, some researchers such as Owen Astrachan have gone to
great lengths to disparage bubble sort and its continued popularity in
computer science education, recommending that it no longer even be
taught.


The Jargon file, which famously calls bogosort "the archetypical perversely
awful algorithm", also calls bubble sort "the generic bad algorithm". Donald
Knuth, in his famous The Art of Computer Programming, concluded that
"the bubble sort seems to have nothing to recommend it, except a catchy
name and the fact that it leads to some interesting theoretical problems",
some of which he discusses therein.


Bubble sort is asymptotically equivalent in running time to insertion sort in
the worst case, but the two algorithms differ greatly in the number of swaps
necessary. Experimental results such as those of Astrachan have also shown
that insertion sort performs considerably better even on random lists. For
these reasons many modern algorithm textbooks avoid using the bubble sort
algorithm in favor of insertion sort.


Bubble sort also interacts poorly with modern CPU hardware. It requires at
least twice as many writes as insertion sort, twice as many cache misses,
and asymptotically more branch mispredictions. Experiments by Astrachan
sorting strings in Java show bubble sort to be roughly 5 times slower than
insertion sort and 40% slower than selection sort.


6.2. Efficient sorting algorithms


6.2.1. Shell sort 
(From Wikipedia, the free encyclopedia) 


Shell sort is a sorting algorithm that is a generalization of insertion sort,
with two observations:


• insertion sort is efficient if the input is "almost sorted", and
• insertion sort is typically inefficient because it moves values just one
position at a time.


Implementation 


The original implementation performs O(n^2) comparisons and exchanges in
the worst case. A minor change given in V. Pratt's book improved the bound
to O(n log^2 n). This is worse than the optimal comparison sorts, which are
O(n log n).


Shell sort improves insertion sort by comparing elements separated by a gap
of several positions. This lets an element take "bigger steps" toward its
expected position. Multiple passes over the data are taken with smaller and
smaller gap sizes. The last step of Shell sort is a plain insertion sort, but by
then, the array of data is guaranteed to be almost sorted.


Consider a small value that is initially stored in the wrong end of the array.
Using an O(n^2) sort such as bubble sort or insertion sort, it will take
roughly n comparisons and exchanges to move this value all the way to the
other end of the array. Shell sort first moves values using giant step sizes, so
a small value will move a long way towards its final position, with just a
few comparisons and exchanges.


One can visualize Shell sort in the following way: arrange the list into a
table and sort the columns (using an insertion sort). Repeat this process,
each time with a smaller number of longer columns. At the end, the table has
only one column. While transforming the list into a table makes it easier to
visualize, the algorithm itself does its sorting in-place (by incrementing the
index by the step size, i.e. using i += step_size instead of i++).


For example, consider a list of numbers like [ 13 14 94 33 82 25 59 94 65
23 45 27 73 25 39 10 ]. If we started with a step-size of 5, we could
visualize this as breaking the list of numbers into a table with 5 columns.
This would look like this:


    13 14 94 33 82
    25 59 94 65 23
    45 27 73 25 39
    10

We then sort each column, which gives us

    10 14 73 25 23
    13 27 94 33 39
    25 59 94 65 82
    45


When read back as a single list of numbers, we get [ 10 14 73 25 23 13 27
94 33 39 25 59 94 65 82 45 ]. Here, the 10 which was all the way at the end
has moved all the way to the beginning. This list is then again sorted using
a 3-gap sort, and then a 1-gap sort (simple insertion sort).
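The worked example can be reproduced with a single "gap insertion sort" pass in Python. The function name gap_insertion_sort is a hypothetical helper; sorting with gap 5 is the same as insertion-sorting each of the 5 table columns.

```python
def gap_insertion_sort(a, gap):
    """Insertion sort applied in place to the elements `gap` positions apart."""
    for i in range(gap, len(a)):
        value = a[i]
        j = i
        while j >= gap and a[j - gap] > value:
            a[j] = a[j - gap]
            j -= gap
        a[j] = value


data = [13, 14, 94, 33, 82, 25, 59, 94, 65, 23, 45, 27, 73, 25, 39, 10]

gap_insertion_sort(data, 5)
after_5_sort = list(data)      # matches the list read back from the table

gap_insertion_sort(data, 3)
gap_insertion_sort(data, 1)    # plain insertion sort finishes the job
```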


Gap sequence 


    Original       32 95 16 82 24 66 35 19 75 54 40 43 93 68
    After 5-sort   32 35 16 68 24 40 43 19 75 54 66 95 93 82    6 swaps
    After 3-sort   32 19 16 43 24 40 54 35 75 68 66 95 93 82    5 swaps
    After 1-sort   16 19 24 32 35 40 43 54 66 68 75 82 93 95   15 swaps

    The shellsort algorithm in action


The gap sequence is an integral part of the shellsort algorithm. Any
increment sequence will work, so long as the last element is 1. The
algorithm begins by performing a gap insertion sort, with the gap being the
first number in the gap sequence. It continues to perform a gap insertion
sort for each number in the sequence, until it finishes with a gap of 1. When
the gap is 1, the gap insertion sort is simply an ordinary insertion sort,
guaranteeing that the final list is sorted.


The gap sequence that was originally suggested by Donald Shell was to
begin with N / 2 and to halve the number until it reaches 1. While this
sequence provides significant performance enhancements over the quadratic
algorithms such as insertion sort, it can be changed slightly to further
decrease the average and worst-case running times. Weiss' textbook[4]
demonstrates that this sequence allows a worst case O(n^2) sort, if the data is
initially in the array as (small_1, large_1, small_2, large_2, ...), that is, the
upper half of the numbers are placed, in sorted order, in the even index
locations and the lower half of the numbers are placed similarly in the odd
indexed locations.


Perhaps the most crucial property of Shellsort is that the elements remain
k-sorted even as the gap diminishes. For instance, if a list was 5-sorted and
then 3-sorted, the list is now not only 3-sorted, but both 5- and 3-sorted. If
this were not true, the algorithm would undo work that it had done in
previous iterations, and would not achieve such a low running time.
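This property can be checked on a sample list with a short Python sketch. The helper names gap_insertion_sort and is_h_sorted are hypothetical; a list is h-sorted when every element is no larger than the element h positions after it.

```python
def gap_insertion_sort(a, gap):
    """Insertion sort applied in place to the elements `gap` positions apart."""
    for i in range(gap, len(a)):
        value = a[i]
        j = i
        while j >= gap and a[j - gap] > value:
            a[j] = a[j - gap]
            j -= gap
        a[j] = value


def is_h_sorted(a, h):
    """True if a[i] <= a[i + h] for every valid index i."""
    return all(a[i] <= a[i + h] for i in range(len(a) - h))


data = [32, 95, 16, 82, 24, 66, 35, 19, 75, 54, 40, 43, 93, 68]
gap_insertion_sort(data, 5)
gap_insertion_sort(data, 3)

still_5_sorted = is_h_sorted(data, 5)   # the 3-sort did not undo the 5-sort
now_3_sorted = is_h_sorted(data, 3)
```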


Depending on the choice of gap sequence, Shellsort has a proven worst-
case running time of O(n^2) (using Shell's increments that start with 1/2 the
array size and divide by 2 each time), O(n^(3/2)) (using Hibbard's increments
of 2^k - 1), O(n^(4/3)) (using Sedgewick's increments of 9(4^i) - 9(2^i) + 1, or
4^(i+1) + 3(2^i) + 1), or O(n log^2 n), and possibly unproven better running
times. The existence of an O(n log n) worst-case implementation of Shellsort
remains an open research question.


The best known sequence is 1, 4, 10, 23, 57, 132, 301, 701. Such a Shell
sort is faster than an insertion sort and a heap sort, but although it is faster
than a quicksort for small arrays (fewer than 50 elements), it is slower for
bigger arrays. Further gaps can be computed, for instance, with:


    nextgap = round(gap * 2.3)
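A Python sketch of Shell sort driven by this sequence, extended with the nextgap rule for larger inputs. The names KNOWN_GAPS, gaps_for and shell_sort are hypothetical, and restricting the gaps to values smaller than the list length is a choice made here, not something the text prescribes.

```python
KNOWN_GAPS = [1, 4, 10, 23, 57, 132, 301, 701]


def gaps_for(n):
    """Gaps smaller than n, largest first, extended by round(gap * 2.3)."""
    gaps = list(KNOWN_GAPS)
    while gaps[-1] < n:
        gaps.append(round(gaps[-1] * 2.3))
    return [g for g in reversed(gaps) if g < n]


def shell_sort(data):
    """Return a sorted copy of data using gap insertion sort passes."""
    a = list(data)
    for gap in gaps_for(len(a)):
        for i in range(gap, len(a)):
            value = a[i]
            j = i
            while j >= gap and a[j - gap] > value:
                a[j] = a[j - gap]
                j -= gap
            a[j] = value
    return a


sample = [(i * 37) % 101 for i in range(300)]
result = shell_sort(sample)
```

Because the sequence always ends with a gap of 1, the last pass is an ordinary insertion sort and the result is fully sorted.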


Shell sort algorithm in C/C++


An implementation of the algorithm in C/C++ for sorting an array of integers.
The increment sequence used in this example code gives an O(n^2) worst-
case running time.


    void shell_sort(int A[], int size)
    {
        int i, j, increment, temp;
        increment = size / 2;

        while (increment > 0)
        {
            /* Gap insertion sort with the current increment. */
            for (i = increment; i < size; i++)
            {
                j = i;
                temp = A[i];
                while ((j >= increment) && (A[j-increment] > temp))
                {
                    A[j] = A[j - increment];
                    j = j - increment;
                }
                A[j] = temp;
            }

            if (increment == 2)
                increment = 1;
            else
                increment = (int) (increment / 2.2);
        }
    }


Shell sort algorithm in Java


The Java implementation of Shell sort is as follows:

    public static void shellSort(int[] a) {
        for (int increment = a.length / 2;
             increment > 0;
             increment = (increment == 2 ? 1 : (int) Math.round(increment / 2.2))) {
            for (int i = increment; i < a.length; i++) {
                for (int j = i; j >= increment && a[j - increment] > a[j]; j -= increment) {
                    int temp = a[j];
                    a[j] = a[j - increment];
                    a[j - increment] = temp;
                }
            }
        }
    }


Shell sort algorithm in Python 
A Python implementation using the same increment sequence:

def shellsort(a):
    def new_increment(a):
        # Start at half the length and shrink by a factor of about 2.2,
        # always finishing with an increment of 1.
        i = len(a) // 2
        while i > 0:
            yield i
            if i == 2:
                i = 1
            else:
                i = int(round(i / 2.2))

    for increment in new_increment(a):
        for i in range(increment, len(a)):
            for j in range(i, increment - 1, -increment):
                if a[j - increment] <= a[j]:
                    break
                a[j], a[j - increment] = a[j - increment], a[j]
    return a


6.2.2. Heap sort 


(From Wikipedia, the free encyclopedia) 


Heapsort is a comparison-based sorting algorithm, and is part of the
selection sort family. Although somewhat slower in practice on most
machines than a good implementation of quicksort, it has the advantage of a
worst-case O(n log n) runtime. Heapsort is an in-place algorithm, but is not
a stable sort.


Overview 


Heapsort inserts the input list elements into a heap data structure. The
largest value (in a max-heap) or the smallest value (in a min-heap) is
extracted repeatedly until none remain, the values having been extracted in
sorted order. The heap's invariant is preserved after each extraction, so the
only cost is that of extraction.


During extraction, the only space required is that needed to store the heap.
In order to achieve constant space overhead, the heap is stored in the part of
the input array that has not yet been sorted. (The structure of this heap is
described at Binary heap: Heap implementation.)


Heapsort uses two heap operations: insertion and root deletion. Each 
extraction places an element in the last empty location of the array. The 
remaining prefix of the array stores the unsorted elements. 
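
The insert-then-extract idea can be sketched with Python's standard heapq module. Note the assumptions: heapq provides a min-heap, and this version builds a separate list rather than reusing the input array, so it illustrates the principle but not the constant-space layout described above. The function name heap_sort_with_heapq is made up for this example.

```python
import heapq


def heap_sort_with_heapq(items):
    """Return a new sorted list by building a min-heap and extracting repeatedly."""
    heap = list(items)
    heapq.heapify(heap)  # establish the heap invariant
    # Each heappop removes the smallest remaining value and restores the invariant.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort_with_heapq([5, 3, 8, 1, 9, 2]))
```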


Variations 


• The most important variation to the simple variant is an improvement
  by R. W. Floyd which gives in practice about a 25% speed improvement
  by using only one comparison in each siftup run, which then needs to
  be followed by a siftdown for the original child; moreover it is more
  elegant to formulate. Heapsort's natural way of indexing works on
  indices from 1 up to the number of items. Therefore the start address
  of the data should be shifted such that this logic can be implemented
  avoiding unnecessary +/- 1 offsets in the coded algorithm.


• Ternary heapsort uses a ternary heap instead of a binary heap; that is,
  each element in the heap has three children. It is more complicated to
  program, but does a constant number of times fewer swap and
  comparison operations. This is because each step in the sift operation
  of a ternary heap requires three comparisons and one swap, whereas in
  a binary heap two comparisons and one swap are required. The ternary
  heap does two steps in less time than the binary heap requires for three
  steps, which multiplies the index by a factor of 9 instead of the factor 8
  of three binary steps. Ternary heapsort is about 12% faster than the
  simple variant of binary heapsort.[citation needed]


• The smoothsort sorting algorithm is a variation of heapsort developed
  by Edsger Dijkstra in 1981. Like heapsort, smoothsort's upper bound is
  O(n log n). The advantage of smoothsort is that it comes closer to O(n)
  time if the input is already sorted to some degree, whereas heapsort
  averages O(n log n) regardless of the initial sorted state. Due to its
  complexity, smoothsort is rarely used.


Comparison with other sorts 


Heapsort primarily competes with quicksort, another very efficient general 
purpose nearly-in-place comparison-based sort algorithm. 


Quicksort is typically somewhat faster due to better cache behavior and
other factors, but the worst-case running time for quicksort is O(n^2), which
is unacceptable for large data sets and can be deliberately triggered given
enough knowledge of the implementation, creating a security risk. See
quicksort for a detailed discussion of this problem, and possible solutions.


Thus, because of the O(n log n) upper bound on heapsort's running time and
constant upper bound on its auxiliary storage, embedded systems with
real-time constraints or systems concerned with security often use heapsort.


Heapsort also competes with merge sort, which has the same time bounds,
but requires Ω(n) auxiliary space, whereas heapsort requires only a constant
amount. Heapsort also typically runs more quickly in practice on machines
with small or slow data caches. On the other hand, merge sort has several
advantages over heapsort:


• Like quicksort, merge sort on arrays has considerably better data cache
  performance, often outperforming heapsort on a modern desktop PC,
  because it accesses the elements in order.

• Merge sort is a stable sort.

• Merge sort parallelizes better; the most trivial way of parallelizing
  merge sort achieves close to linear speedup, while there is no obvious
  way to parallelize heapsort at all.

• Merge sort can be easily adapted to operate on linked lists and very
  large lists stored on slow-to-access media such as disk storage or
  network attached storage. Heapsort relies strongly on random access,
  and its poor locality of reference makes it very slow on media with
  long access times.


An interesting alternative to heapsort is introsort, which combines quicksort
and heapsort to retain advantages of both: the worst-case speed of heapsort
and the average speed of quicksort.


Pseudocode 


The following is the "simple" way to implement the algorithm, in
pseudocode, where swap is used to swap two elements of the array. Notice
that the arrays are zero-based in this example.


function heapSort(a, count) is
    input: an unordered array a of length count

    (first place a in max-heap order)
    heapify(a, count)

    end := count - 1
    while end > 0 do
        (swap the root (maximum value) of the heap with the last element of
        the heap)
        swap(a[end], a[0])
        (decrease the size of the heap by one so that the previous max value
        will stay in its proper placement)
        end := end - 1
        (put the heap back in max-heap order)
        siftDown(a, 0, end)

function heapify(a, count) is
    (start is assigned the index in a of the last parent node)
    start := floor(count / 2) - 1

    while start >= 0 do
        (sift down the node at index start to the proper place such that all
        nodes below the start index are in heap order)
        siftDown(a, start, count - 1)
        start := start - 1
    (after sifting down the root all nodes/elements are in heap order)

function siftDown(a, start, end) is
    input: end represents the limit of how far down the heap to sift.
    root := start

    while root * 2 + 1 <= end do         (while the root has at least one child)
        child := root * 2 + 1            (root*2+1 points to the left child)
        (if the child has a sibling and the child's value is less than its
        sibling's...)
        if child < end and a[child] < a[child + 1] then
            child := child + 1           (...then point to the right child instead)
        if a[root] < a[child] then       (out of max-heap order)
            swap(a[root], a[child])
            root := child                (repeat to continue sifting down the child now)
        else
            return
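
The pseudocode above might be rendered in Python roughly as follows. This is a sketch under two stated assumptions: end is treated as the inclusive index of the last heap element, and the heapify loop runs down to and including index 0. The snake_case function names are this example's own.

```python
def sift_down(a, start, end):
    """Restore max-heap order below index start; end is the last valid index."""
    root = start
    while 2 * root + 1 <= end:           # while the root has at least one child
        child = 2 * root + 1             # left child
        if child < end and a[child] < a[child + 1]:
            child += 1                   # right child is larger
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            return


def heapify(a):
    """Bottom-up heap construction: sift down every parent, last parent first."""
    for start in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, start, len(a) - 1)


def heap_sort(a):
    """Sort the list a in place and return it."""
    heapify(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]      # move the current maximum to its final place
        sift_down(a, 0, end - 1)         # the heap now occupies a[0..end-1]
    return a
```

For example, heap_sort([6, 5, 3, 1, 8, 7, 2, 4]) rearranges the list into ascending order without allocating a second list.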

The heapify function can be thought of as successively inserting into the
heap and sifting up. The two versions only differ in the order of data
processing. The above heapify function starts at the bottom and moves up
while sifting down (bottom-up). The following heapify function starts at the
top and moves down while sifting up (top-down).

function heapify(a, count) is
    (end is assigned the index of the first (left) child of the root)
    end := 1

    while end < count
        (sift up the node at index end to the proper place such that all
        nodes above the end index are in heap order)
        siftUp(a, 0, end)
        end := end + 1
    (after sifting up the last node all nodes are in heap order)

function siftUp(a, start, end) is
    input: start represents the limit of how far up the heap to sift.
           end is the node to sift up.
    child := end
    while child > start
        parent := floor((child - 1) / 2)
        if a[parent] < a[child] then     (out of max-heap order)
            swap(a[parent], a[child])
            child := parent              (repeat to continue sifting up the parent now)
        else
            return


It can be shown that the bottom-up (siftDown) variant of heapify runs in O(n)
time, while the top-down (siftUp) variant takes O(n log n) time in the worst
case.
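
For comparison with the bottom-up version, the top-down construction can be sketched in Python as below. The function names and the is_max_heap checker are illustrative additions; the logic follows the siftUp pseudocode above.

```python
def sift_up(a, start, end):
    """Move a[end] up toward index start until its parent is no smaller."""
    child = end
    while child > start:
        parent = (child - 1) // 2
        if a[parent] < a[child]:
            a[parent], a[child] = a[child], a[parent]
            child = parent
        else:
            return


def heapify_top_down(a):
    """Grow the heap one element at a time, sifting each new element up."""
    for end in range(1, len(a)):
        sift_up(a, 0, end)


def is_max_heap(a):
    """True if every element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


data = [3, 9, 2, 7, 1, 8, 5]
heapify_top_down(data)
print(is_max_heap(data))
```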


C-code 


Below is an implementation of the "standard" heapsort (also called
bottom-up-heapsort). It is faster on average (see Knuth, Sec. 5.2.3, Ex. 18)
and even better in worst-case behavior (1.5n log n) than the simple heapsort
(2n log n). The sift_in routine is first a sift_up of the free position
followed by a sift_down of the new item. The needed data comparison is only
in the macro data_i_LESS_THAN_ for easy adaption.


This code is flawed (see the talk page of the Wikipedia article).


/* Heapsort based on ideas of J.W.Williams/R.W.Floyd/S.Carlsson */

#define data_i_LESS_THAN_(other) (data[i] < other)
#define MOVE_i_TO_free { data[free] = data[i]; free = i; }

void sift_in(unsigned count, SORTTYPE *data, unsigned free_in,
             SORTTYPE next)
{
    unsigned i;
    unsigned free = free_in;

    // sift up the free node
    for (i = 2 * free; i < count; i += i)
    {
        if (data_i_LESS_THAN_(data[i + 1])) i++;
        MOVE_i_TO_free
    }

    // special case in sift up if the last inner node has only 1 child
    if (i == count)
        MOVE_i_TO_free

    // sift down the new item next
    while (((i = free / 2) >= free_in) && data_i_LESS_THAN_(next))
        MOVE_i_TO_free

    data[free] = next;
}

void heapsort(unsigned count, SORTTYPE *data)
{
    unsigned j;

    if (count <= 1) return;

    data -= 1; // map addresses to indices 1 til count

    // build the heap structure
    for (j = count / 2; j >= 1; j--) {
        SORTTYPE next = data[j];
        sift_in(count, data, j, next);
    }

    // search next by next remaining extremal element
    for (j = count - 1; j >= 1; j--) {
        SORTTYPE next = data[j + 1];
        data[j + 1] = data[1]; // extract extremal element from the heap
        sift_in(j, data, 1, next);
    }
}


6.2.3. Quicksort 


(From Wikipedia, the free encyclopedia) 


Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare
that, on average, makes Θ(n log n) (big O notation) comparisons to sort n
items. However, in the worst case, it makes Θ(n^2) comparisons. Typically,
quicksort is significantly faster in practice than other Θ(n log n)
algorithms, because its inner loop can be efficiently implemented on most
architectures, and in most real-world data it is possible to make design
choices which minimize the possibility of requiring quadratic time.


Quicksort is a comparison sort and is not a stable sort.


The algorithm 


Quicksort sorts by employing a divide and conquer strategy to divide a list
into two sub-lists.


The steps are:


1. Pick an element, called a pivot, from the list.

2. Reorder the list so that all elements which are less than the pivot come
   before the pivot and so that all elements greater than the pivot come
   after it (equal values can go either way). After this partitioning, the
   pivot is in its final position. This is called the partition operation.

3. Recursively sort the sub-list of lesser elements and the sub-list of
   greater elements.


The base cases of the recursion are lists of size zero or one, which are
always sorted. The algorithm always terminates because it puts at least one
element in its final place on each iteration (the loop invariant).


In simple pseudocode, the algorithm might be expressed as: 


function quicksort(array)
    var list less, pivotList, greater
    if length(array) <= 1
        return array
    select a pivot value pivot from array
    for each x in array
        if x < pivot then add x to less
        if x = pivot then add x to pivotList
        if x > pivot then add x to greater
    return concatenate(quicksort(less), pivotList, quicksort(greater))


Notice that we only examine elements by comparing them to other
elements. This makes quicksort a comparison sort.
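
The simple pseudocode translates almost line for line into Python. In this sketch the middle element is used as the pivot, which is an arbitrary choice made for the example; any element of the list would do. The name quicksort_simple is chosen here to avoid clashing with other versions.

```python
def quicksort_simple(array):
    """Return a new sorted list; the input list is left unchanged."""
    if len(array) <= 1:
        return list(array)
    pivot = array[len(array) // 2]       # assumption: middle element as pivot
    less = [x for x in array if x < pivot]
    pivot_list = [x for x in array if x == pivot]
    greater = [x for x in array if x > pivot]
    return quicksort_simple(less) + pivot_list + quicksort_simple(greater)


print(quicksort_simple([3, 7, 8, 5, 2, 1, 9, 5, 4]))
```

Each call builds three new lists, so this version trades memory for clarity.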


Version with in-place partition 


[Figure: in-place partition in action on a small list. The boxed element is
the pivot element, blue elements are less than or equal to it, and red
elements are larger.]


The disadvantage of the simple version above is that it requires Ω(n) extra
storage space, which is as bad as mergesort (see big-O notation for the
meaning of Ω). The additional memory allocations required can also
drastically impact speed and cache performance in practical
implementations. There is a more complicated version which uses an
in-place partition algorithm and can achieve O(log n) space use on average
for good pivot choices:


function partition(array, left, right, pivotIndex)
    pivotValue := array[pivotIndex]
    swap(array, pivotIndex, right) // Move pivot to end
    storeIndex := left - 1
    for i from left to right-1
        if array[i] <= pivotValue
            storeIndex := storeIndex + 1
            swap(array, storeIndex, i)
    swap(array, right, storeIndex+1) // Move pivot to its final place
    return storeIndex+1


This form of the partition algorithm is not the original form; multiple
variations can be found in various textbooks, such as versions not having
the storeIndex. However, this form is probably the easiest to understand.


This is the in-place partition algorithm. It partitions the portion of the
array between indexes left and right, inclusively, by moving all elements
less than or equal to a[pivotIndex] to the beginning of the subarray, leaving
all the greater elements following them. In the process it also finds the
final position for the pivot element, which it returns. It temporarily moves
the pivot element to the end of the subarray, so that it doesn't get in the
way. Because it only uses exchanges, the final list has the same elements as
the original list. Notice that an element may be exchanged multiple times
before reaching its final place.


Once we have this, writing quicksort itself is easy:

function quicksort(array, left, right)
    if right > left
        select a pivot index (e.g. pivotIndex := left)
        pivotNewIndex := partition(array, left, right, pivotIndex)
        quicksort(array, left, pivotNewIndex-1)
        quicksort(array, pivotNewIndex+1, right)
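
A Python sketch of the in-place version follows. Two details are assumptions of this example rather than part of the pseudocode: store_index starts at left and is incremented after each swap (equivalent to starting at left - 1 and incrementing first), and the pivot is taken from the middle of the range.

```python
def partition(array, left, right, pivot_index):
    """Partition array[left..right] around array[pivot_index]; return the pivot's final index."""
    pivot_value = array[pivot_index]
    array[pivot_index], array[right] = array[right], array[pivot_index]  # move pivot to end
    store_index = left
    for i in range(left, right):
        if array[i] <= pivot_value:
            array[i], array[store_index] = array[store_index], array[i]
            store_index += 1
    array[store_index], array[right] = array[right], array[store_index]  # pivot to final place
    return store_index


def quicksort(array, left=0, right=None):
    """Sort array[left..right] in place and return the list."""
    if right is None:
        right = len(array) - 1
    if right > left:
        pivot_index = left + (right - left) // 2   # assumption: middle element as pivot
        pivot_new_index = partition(array, left, right, pivot_index)
        quicksort(array, left, pivot_new_index - 1)
        quicksort(array, pivot_new_index + 1, right)
    return array
```

After partition returns index p, every element before p is less than or equal to array[p] and every element after it is greater, which is why the two recursive calls can ignore position p.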


Parallelization 


Like mergesort, quicksort can also be easily parallelized due to its
divide-and-conquer nature. Individual in-place partition operations are
difficult to parallelize, but once divided, different sections of the list
can be sorted in parallel. If we have p processors, we can divide a list of n
elements into p sublists in Θ(n) average time, then sort each of these in
O((n/p) log (n/p)) average time. Ignoring the O(n) preprocessing, this is
linear speedup. Given Θ(n) processors, only O(n) time is required overall.


One advantage of parallel quicksort over other parallel sort algorithms is
that no synchronization is required. A new thread is started as soon as a
sublist is available for it to work on and it does not communicate with other
threads. When all threads complete, the sort is done.


Other more sophisticated parallel sorting algorithms can achieve even better
time bounds. For example, in 1991 David Powers described a parallelized
quicksort that can operate in O(log n) time given enough processors by
performing partitioning implicitly [1].


Formal analysis 


From the initial description it's not obvious that quicksort takes O(n log n)
time on average. It's not hard to see that the partition operation, which
simply loops over the elements of the array once, uses Θ(n) time. In
versions that perform concatenation, this operation is also Θ(n).


In the best case, each time we perform a partition we divide the list into two
nearly equal pieces. This means each recursive call processes a list of half
the size. Consequently, we can make only log n nested calls before we
reach a list of size 1. This means that the depth of the call tree is O(log n).
But no two calls at the same level of the call tree process the same part of
the original list; thus, each level of calls needs only O(n) time all together
(each call has some constant overhead, but since there are only O(n) calls at
each level, this is subsumed in the O(n) factor). The result is that the
algorithm uses only O(n log n) time.


An alternate approach is to set up a recurrence relation for T(n), the
time needed to sort a list of size n. Because a single quicksort call involves
O(n) work plus two recursive calls on lists of size n/2 in the best
case, the relation would be:


T(n) = O(n) + 2T(n/2).


The master theorem tells us that


T(n) = Θ(n log n).


In fact, it's not necessary to divide the list this precisely; even if each
pivot splits the elements with 99% on one side and 1% on the other (or any
other fixed fraction), the call depth is still limited to 100 log n, so the
total running time is still O(n log n).


In the worst case, however, the two sublists have size 1 and n - 1, and the
call tree becomes a linear chain of n nested calls. The ith call does
O(n - i) work, and


\sum_{i=0}^{n} (n - i) = O(n^2).


The recurrence relation is:


T(n) = O(n) + T(1) + T(n - 1) = O(n) + T(n - 1).


This is the same relation as for insertion sort and selection sort, and it
solves to T(n) = Θ(n^2).


Randomized quicksort expected complexity 


Randomized quicksort has the desirable property that it requires only
O(n log n) expected time, regardless of the input. But what makes random
pivots a good choice?


Suppose we sort the list and then divide it into four parts. The two parts in
the middle will contain the best pivots; each of them is larger than at least
25% of the elements and smaller than at least 25% of the elements. If we
could consistently choose an element from these two middle parts, we
would only have to split the list at most 2 log2 n times before reaching lists
of size 1, yielding an O(n log n) algorithm.


Unfortunately, a random choice will only choose from these middle parts
half the time. The surprising fact is that this is good enough. Imagine that
you are flipping a coin over and over until you get k heads. Although this
could take a long time, on average only 2k flips are required, and the
chance that you won't get k heads after 100k flips is infinitesimally small.
By the same argument, quicksort's recursion will terminate on average at a
call depth of only 2 log2 n. But if its average call depth is O(log n), and
each level of the call tree processes at most n elements, the total amount of
work done on average is the product, O(n log n).
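
To see the effect in practice, the following sketch sorts with uniformly random pivots and reports the deepest recursion level reached. The function name and the idea of returning the depth are this example's own; the partition step is a standard in-place scheme. On an already-sorted input, which is a bad case for a fixed first-element or last-element pivot, the observed depth should stay far below n.

```python
import random


def randomized_quicksort(a, lo=0, hi=None, depth=1):
    """Sort a[lo..hi] in place with random pivots; return the deepest call level reached."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return depth
    p = random.randint(lo, hi)           # pivot index chosen uniformly at random
    a[p], a[hi] = a[hi], a[p]
    pivot = a[hi]
    store = lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]    # pivot is now in its final position
    return max(randomized_quicksort(a, lo, store - 1, depth + 1),
               randomized_quicksort(a, store + 1, hi, depth + 1))


data = list(range(500))                  # already sorted input
print("deepest call level:", randomized_quicksort(data))
```

The printed depth varies from run to run because the pivots are random, so no fixed value is shown here.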


Average complexity 


Even if we aren't able to choose pivots randomly, quicksort still requires
only O(n log n) time averaged over all possible permutations of its input.
Because this average is simply the sum of the times over all permutations of
the input divided by n factorial, it's equivalent to choosing a random
permutation of the input. When we do this, the pivot choices are essentially
random, leading to an algorithm with the same running time as randomized
quicksort.


More precisely, the average number of comparisons over all permutations
of the input sequence can be estimated accurately by solving the recurrence
relation:


C(n) = n - 1 + \frac{1}{n} \sum_{i=0}^{n-1} (C(i) + C(n-i-1)) \approx 2n \ln n \approx 1.39 n \log_2 n.


Here, n - 1 is the number of comparisons the partition uses. Since the pivot
is equally likely to fall anywhere in the sorted list order, the sum is
averaging over all possible splits.


This means that, on average, quicksort performs only about 39% worse than
the ideal number of comparisons, which is its best case. In this sense it is
closer to the best case than the worst case. This fast average runtime is
another reason for quicksort's practical dominance over other sorting
algorithms.
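
The recurrence for the average number of comparisons can also be evaluated numerically. The sketch below is only an illustration: it uses the fact that the two halves of the sum are mirror images, so the sum equals twice the running total of C(0) through C(n-1). The function name average_comparisons is an assumption of this example.

```python
import math


def average_comparisons(n_max):
    """Return a list c where c[n] is the average comparison count for n elements."""
    c = [0.0] * (n_max + 1)
    prefix = 0.0                          # running total c[0] + ... + c[n-1]
    for n in range(1, n_max + 1):
        prefix += c[n - 1]
        c[n] = (n - 1) + 2.0 * prefix / n
    return c


c = average_comparisons(100000)
for n in (100, 10000, 100000):
    print(n, c[n] / (n * math.log2(n)))
```

The printed ratios approach the constant 1.39 only slowly, from below, because lower-order terms linear in n are still significant at these sizes.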


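In the best case, where every pivot splits the list exactly in half, the
number of comparisons C(n) can be unrolled as follows:


C(n) = (n-1) + C(n/2) + C(n/2)
     = (n-1) + 2C(n/2)
     = (n-1) + 2((n/2) - 1 + 2C(n/4))
     = n + n + 4C(n/4) - 1 - 2
     = n + n + n + 8C(n/8) - 1 - 2 - 4
     = ...
     = kn + 2^k C(n/(2^k)) - (1 + 2 + 4 + ... + 2^(k-1))
       where log2 n > k > 0
     = kn + 2^k C(n/(2^k)) - 2^k + 1
    -> n log2 n + n C(1) - n + 1.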


Space complexity 
The space used by quicksort depends on the version used. 


Quicksort has a space complexity of O(log n), even in the worst case, when
it is carefully implemented such that


• in-place partitioning is used. This requires O(1).

• After partitioning, the partition with the fewest elements is
  (recursively) sorted first, requiring at most O(log n) space. Then the
  other partition is sorted using tail-recursion or iteration.
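
The two conditions above can be combined as in the following sketch: the smaller side is handled by a recursive call, and the larger side by continuing the loop. The names partition_last and quicksort_bounded_stack, and the deliberately naive last-element pivot, are assumptions of this example; the naive pivot makes sorted input take quadratic time, yet the recursion depth stays small.

```python
def partition_last(a, lo, hi):
    """In-place partition of a[lo..hi] using the last element as pivot."""
    pivot = a[hi]
    store = lo
    for i in range(lo, hi):
        if a[i] <= pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]
    return store


def quicksort_bounded_stack(a, lo=0, hi=None):
    """Sort a in place; recurse only into the smaller partition."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        p = partition_last(a, lo, hi)
        if p - lo < hi - p:
            quicksort_bounded_stack(a, lo, p - 1)   # smaller left side: recurse
            lo = p + 1                              # larger right side: iterate
        else:
            quicksort_bounded_stack(a, p + 1, hi)   # smaller right side: recurse
            hi = p - 1                              # larger left side: iterate
    return a


# 2000 sorted elements would exceed Python's default recursion limit
# if both sides were sorted recursively with this pivot choice.
print(quicksort_bounded_stack(list(range(2000)))[:5])
```

Since each recursive call covers at most half of the current range, the nesting depth is bounded by about log2 n regardless of pivot quality.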


The version of quicksort with in-place partitioning uses only constant
additional space before making any recursive call. However, if it has made
O(log n) nested recursive calls, it needs to store a constant amount of
information from each of them. Since the best case makes at most O(log n)
nested recursive calls, it uses O(log n) space. The worst case makes O(n)
nested recursive calls, and so needs O(n) space.


We are eliding a small detail here, however. If we consider sorting
arbitrarily large lists, we have to keep in mind that our variables like left
and right can no longer be considered to occupy constant space; it takes
O(log n) bits to index into a list of n items. Because we have variables like
this in every stack frame, in reality quicksort requires O(log^2 n) bits of
space in the best and average case and O(n log n) space in the worst case.
This isn't too terrible, though, since if the list contains mostly distinct
elements, the list itself will also occupy O(n log n) bits of space.


The not-in-place version of quicksort uses O(n) space before it even makes
any recursive calls. In the best case its space is still limited to O(n),
because each level of the recursion uses half as much space as the last, and


\sum_{i=0}^{\infty} n / 2^i = 2n.


Its worst case is dismal, requiring


\sum_{i=0}^{n} (n - i + 1) = Θ(n^2)


space, far more than the list itself. If the list elements are not themselves
constant size, the problem grows even larger; for example, if most of the list
elements are distinct, each would require about O(log n) bits, leading to a
best-case O(n log n) and worst-case O(n^2 log n) space requirement.


Selection-based pivoting 


A selection algorithm chooses the kth smallest of a list of numbers; this is
an easier problem in general than sorting. One simple but effective selection
algorithm works nearly in the same manner as quicksort, except that instead
of making recursive calls on both sublists, it only makes a single
tail-recursive call on the sublist which contains the desired element. This
small change lowers the average complexity to linear or Θ(n) time, and makes
it an in-place algorithm. A variation on this algorithm brings the worst-case
time down to O(n) (see selection algorithm for more information).
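
A sketch of this selection scheme in Python is given below, with the tail call written as a loop. Several choices are assumptions of the example: k counts from 0, the pivot is chosen at random, and the function works on a copy so that the caller's list is not reordered (a truly in-place version would skip the copy).

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element of items (k = 0 gives the minimum)."""
    a = list(items)                       # copy; drop this to work in place
    if not 0 <= k < len(a):
        raise IndexError("k out of range")
    lo, hi = 0, len(a) - 1
    while True:
        if lo == hi:
            return a[lo]
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]  # pivot is now at its sorted position
        if k == store:
            return a[store]
        if k < store:
            hi = store - 1                # continue in the left part only
        else:
            lo = store + 1                # continue in the right part only


print(quickselect([7, 2, 9, 4, 1, 8], 2))   # third smallest element
```

Only one side of each partition is examined further, which is what brings the average cost down from O(n log n) to linear.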


Conversely, once we know a worst-case O(n) selection algorithm is
available, we can use it to find the ideal pivot (the median) at every step of
quicksort, producing a variant with worst-case O(n log n) running time. In
practical implementations, however, this variant is considerably slower on
average.


Competitive sorting algorithms


Quicksort is a space-optimized version of the binary tree sort. Instead of
inserting items sequentially into an explicit tree, quicksort organizes them
concurrently into a tree that is implied by the recursive calls. The
algorithms make exactly the same comparisons, but in a different order.


The most direct competitor of quicksort is heapsort. Heapsort is typically
somewhat slower than quicksort, but the worst-case running time is always
O(n log n). Quicksort is usually faster, though there remains the chance of
worst-case performance except in the introsort variant. If it is known in
advance that heapsort is going to be necessary, using it directly will be
faster than waiting for introsort to switch to it. Heapsort also has the
important advantage of using only constant additional space (heapsort is
in-place), whereas even the best variant of quicksort uses O(log n) space.
However, heapsort requires efficient random access to be practical.


Quicksort also competes with mergesort, another recursive sort algorithm
but with the benefit of worst-case O(n log n) running time. Mergesort is a
stable sort, unlike quicksort and heapsort, and can be easily adapted to
operate on linked lists and very large lists stored on slow-to-access media
such as disk storage or network attached storage. Although quicksort can be
written to operate on linked lists, it will often suffer from poor pivot
choices without random access. The main disadvantage of mergesort is that,
when operating on arrays, it requires Ω(n) auxiliary space in the best case,
whereas the variant of quicksort with in-place partitioning and tail recursion
uses only O(log n) space. (Note that when operating on linked lists,
mergesort only requires a small, constant amount of auxiliary storage.)


6.2.4. Merge sort
(From Wikipedia, the free encyclopedia) 


In computer science, merge sort or mergesort is an O(n log n)
comparison-based sorting algorithm. It is stable, meaning that it preserves
the input order of equal elements in the sorted output. It is an example of
the divide and conquer algorithmic paradigm. It was invented by John von
Neumann in 1945.


[Figure: a merge sort algorithm used to sort an array of 7 integer values.
These are the steps a human would take to emulate merge sort.]


Algorithm 
Conceptually, merge sort works as follows:
1. Divide the unsorted list into two sublists of about half the size.
2. Divide each of the two sublists recursively until we have list sizes of
   length 1, in which case the list itself is returned.
3. Merge the two sublists back into one sorted list.


Mergesort incorporates two main ideas to improve its runtime: 


1. A small list will take fewer steps to sort than a large list.


2. Fewer steps are required to construct a sorted list from two sorted lists
   than two unsorted lists. For example, you only have to traverse each
   list once if they're already sorted (see the merge function below for an
   example implementation).

Example: using mergesort to sort a list of integers contained in an array:


Suppose we have an array A with indices ranging from A'first to A'last.
We apply mergesort to A(A'first..A'centre) and A(A'centre+1..A'last), where
A'centre is the integer part of (A'first + A'last)/2. When the two halves are
returned they will have been sorted. They can now be merged together to
form a sorted array.


In a simple pseudocode form, the algorithm could look something like this:

function mergesort(m)
    var list left, right, result
    if length(m) <= 1
        return m
    else
        var middle = length(m) / 2
        for each x in m up to middle
            add x to left
        for each x in m after middle
            add x to right
        left = mergesort(left)
        right = mergesort(right)
        result = merge(left, right)
        return result


There are several variants for the merge() function; the simplest variant
could look like this:


function merge(left, right)
    var list result
    while length(left) > 0 and length(right) > 0
        if first(left) <= first(right)
            append first(left) to result
            left = rest(left)
        else
            append first(right) to result
            right = rest(right)
    (at most one of the two lists still has elements; append all of them)
    if length(left) > 0
        append left to result
    if length(right) > 0
        append right to result
    return result
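
The two pseudocode functions might be written in Python as in the sketch below. Index counters replace the first/rest list operations, which is a choice made for this example to avoid repeatedly copying the lists; taking from the left list on ties is what keeps the sort stable.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # ties go to the left list (stability)
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])              # at most one of these is non-empty
    result.extend(right[j:])
    return result


def merge_sort(m):
    """Return a new sorted list containing the elements of m."""
    if len(m) <= 1:
        return list(m)
    middle = len(m) // 2
    return merge(merge_sort(m[:middle]), merge_sort(m[middle:]))


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
```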


C++ implementation 


Here is an implementation using the STL algorithm std::inplace_merge to
create an iterative bottom-up in-place merge sort:


#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <random>

int main()
{
    std::vector<unsigned> data;
    for (unsigned i = 0; i < 10; i++)
        data.push_back(i);

    // std::random_shuffle was removed in C++17; std::shuffle replaces it
    std::mt19937 rng(std::random_device{}());
    std::shuffle(data.begin(), data.end(), rng);

    std::cout << "Initial: ";
    std::copy(data.begin(), data.end(),
              std::ostream_iterator<unsigned>(std::cout, " "));
    std::cout << std::endl;

    // merge runs of width m into runs of width 2m, doubling m each pass
    for (unsigned m = 1; m <= data.size(); m *= 2)
    {
        for (unsigned i = 0; i < data.size() - m; i += m * 2)
        {
            std::inplace_merge(
                data.begin() + i,
                data.begin() + i + m,
                data.begin() + std::min<unsigned>(i + m * 2, (unsigned)data.size()));
        }
    }

    std::cout << "Sorted: ";
    std::copy(data.begin(), data.end(),
              std::ostream_iterator<unsigned>(std::cout, " "));
    std::cout << std::endl;

    return 0;
}


Analysis 


In sorting n items, merge sort has an average and worst-case performance of
O(n log n). If the running time of merge sort for a list of length n is T(n),
then the recurrence T(n) = 2T(n/2) + n follows from the definition of the
algorithm (apply the algorithm to two lists of half the size of the original
list, and add the n steps taken to merge the resulting two lists). The closed
form follows from the master theorem.


In the worst case, merge sort does approximately (n ⌈lg n⌉ - 2^⌈lg n⌉ + 1)
comparisons, which is between (n lg n - n + 1) and (n lg n + n + O(lg n)).
[2]


For large n and a randomly ordered input list, merge sort's expected
(average) number of comparisons approaches α·n fewer than the worst case,
where


\alpha = -1 + \sum_{k=0}^{\infty} \frac{1}{2^k + 1} \approx 0.2645.


In the worst case, merge sort does about 39% fewer comparisons than
quicksort does in the average case; merge sort always makes fewer
comparisons than quicksort, except in extremely rare cases, when they tie,
where merge sort's worst case is found simultaneously with quicksort's best
case. In terms of moves, merge sort's worst-case complexity is O(n log n),
the same complexity as quicksort's best case, and merge sort's best case
takes about half as many iterations as the worst case.


Recursive implementations of merge sort make 2n - 1 method calls in the
worst case, compared to quicksort's n, and thus have roughly twice as much
recursive overhead as quicksort. However, iterative, non-recursive
implementations of merge sort, avoiding method call overhead, are not
difficult to code. Merge sort's most common implementation does not sort
in place; therefore, memory the size of the input must be allocated for the
sorted output to be stored in. Sorting in place is possible but is very
complicated, and will offer little performance gain in practice, even if the
algorithm runs in O(n log n) time. In these cases, algorithms like heapsort
usually offer comparable speed, and are far less complex.


Merge sort is more efficient than quicksort for some types of lists if the data
to be sorted can only be efficiently accessed sequentially, and is thus
popular in languages such as Lisp, where sequentially accessed data
structures are very common. Unlike some (efficient) implementations of
quicksort, merge sort is a stable sort as long as the merge operation is
implemented properly.


The mergesort procedure above invites some complaints. One is its use of 2n
locations; the additional n locations are needed because one cannot
reasonably merge two sorted sets in place. Despite the use of this space, the
algorithm must still work hard, copying the result placed into the result
list back into the m list on each call of merge. An alternative to this
copying is to associate a new field of information with each key (the
elements in m are called keys). This field will be used to link the keys and
any associated information together in a sorted list (keys and related
information are called records). Then the merging of the sorted lists
proceeds by changing the link values, and no records need to be moved at
all. A field which contains only a link will generally be smaller than an
entire record, so less space will also be used.


Merge sorting tape drives 


Merge sort is so inherently sequential that it's practical to run it using
slow tape drives as input and output devices. It requires very little memory,
and the memory required does not change with the number of data elements. If
you have four tape drives, it works as follows:


1. Divide the data to be sorted in half and put half on each of two tapes

2. Merge individual pairs of records from the two tapes; write two-record
chunks alternately to each of the two output tapes

3. Merge the two-record chunks from the two output tapes into four-record
chunks; write these alternately to the original two input tapes

4. Merge the four-record chunks into eight-record chunks; write these
alternately to the original two output tapes

5. Repeat until you have one chunk containing all the data, sorted; that
is, for log n passes, where n is the number of records.
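
The pass structure of the steps above can be sketched in memory. In this simplified illustration two arrays stand in for the input and output tape pairs, so chunks sit side by side in one array instead of alternating across two tapes. The class and method names are hypothetical.

```java
// Hypothetical sketch: bottom-up merge passes with doubling chunk size.
final class TapeMergeSketch {
    // One pass: merge adjacent sorted chunks of length `width` from src into dst.
    static void pass(int[] src, int[] dst, int width) {
        int n = src.length;
        for (int lo = 0; lo < n; lo += 2 * width) {
            int mid = Math.min(lo + width, n);
            int hi = Math.min(lo + 2 * width, n);
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi) dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
            while (i < mid) dst[k++] = src[i++];
            while (j < hi) dst[k++] = src[j++];
        }
    }

    // Sorts data in place and returns the number of merge passes performed.
    static int sort(int[] data) {
        int[] src = data, dst = new int[data.length];
        int passes = 0;
        for (int width = 1; width < data.length; width *= 2) {
            pass(src, dst, width);
            int[] t = src; src = dst; dst = t;   // input and output "tapes" swap roles
            passes++;
        }
        if (src != data) System.arraycopy(src, 0, data, 0, data.length);
        return passes;
    }
}
```

For eight records the chunk widths are 1, 2 and 4, so the sort takes three passes, matching the log n passes stated in step 5.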


For the same reason it is also very useful for sorting data on disk that is too
large to fit entirely into primary memory. On tape drives that can run both
backwards and forwards, you can run merge passes in both directions,
avoiding rewind time.


Optimizing merge sort 


This might seem to be of historical interest only, but on modern computers
locality of reference is of paramount importance in software optimization,
because multi-level memory hierarchies are used. In some sense, main
RAM can be seen as a fast tape drive, level 3 cache memory as a slightly
faster one, level 2 cache memory as faster still, and so on. In some
circumstances, cache reloading might impose unacceptable overhead, and a
carefully crafted merge sort might result in a significant improvement in
running time. This opportunity might change if fast memory becomes very
cheap again, or if exotic architectures like the Tera MTA become
commonplace.


Designing a merge sort to perform optimally often requires adjustment to
available hardware, e.g. number of tape drives, or size and speed of the
relevant cache memory levels.


Typical implementation bugs 


A typical mistake made in many merge sort implementations is in the division
of index-based lists into two sublists. Many implementations determine the
middle index as outlined in the following implementation example:


function merge(int left, int right)
{
    if (left < right) {
        int middle = (left + right) / 2;
        […]


While this algorithm appears to work very well in most scenarios, it fails
for very large lists. The addition of "left" and "right" can lead to an
integer overflow, resulting in a completely wrong division of the list. This
problem can be solved by increasing the data type size used for the
addition, or by altering the algorithm:


int middle = left + ((right - left) / 2);


Note that the following two examples do not address the issue of integer
overflow but dodge it under irrelevant efficiency claims.


Probably faster and arguably as clear is:

int middle = (left + right) >>> 1;

In C and C++ (where you don't have the >>> operator), you can do this:
middle = ((unsigned) left + (unsigned) right) >> 1;


See more information here:
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
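
The difference between the midpoint expressions is easy to see with indices whose sum exceeds the range of a 32-bit int. The following sketch is only an illustration; the class name, method names and index values are chosen for the demonstration and do not come from any particular implementation.

```java
// Hypothetical sketch: three ways of computing the middle index.
final class MidpointSketch {
    // Overflows when left + right exceeds Integer.MAX_VALUE.
    static int naive(int left, int right) { return (left + right) / 2; }

    // Never forms the large sum; valid for 0 <= left <= right.
    static int safe(int left, int right) { return left + (right - left) / 2; }

    // The sum may wrap, but the unsigned shift reads the wrapped bits as a
    // non-negative value, which is correct when both indices are non-negative.
    static int shifted(int left, int right) { return (left + right) >>> 1; }

    public static void main(String[] args) {
        int left = 1_500_000_000, right = 2_000_000_000;
        System.out.println(naive(left, right));    // prints -397483648
        System.out.println(safe(left, right));     // prints 1750000000
        System.out.println(shifted(left, right));  // prints 1750000000
    }
}
```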


Comparison with other sort algorithms 


Although heapsort has the same time bounds as merge sort, it requires only
O(1) auxiliary space instead of merge sort's O(n), and is often faster in
practical implementations. Quicksort, however, is considered by many to be
the fastest general-purpose sort algorithm. On the plus side, merge sort is a
stable sort, parallelizes better, and is more efficient at handling
slow-to-access sequential media. Merge sort is often the best choice for sorting a
linked list: in this situation it is relatively easy to implement a merge sort in
such a way that it requires only O(1) extra space, and the slow
random-access performance of a linked list makes some other algorithms (such as
quicksort) perform poorly, and others (such as heapsort) completely
impossible.


As of Perl 5.8, merge sort is its default sorting algorithm (it was quicksort in
previous versions of Perl). In Java, the Arrays.sort() methods use mergesort
or a tuned quicksort depending on the datatypes, and for implementation
efficiency switch to insertion sort when fewer than seven array elements are
being sorted.


Utility in online Sorting 


Mergesort's merge operation is useful in online sorting, where the list to be
sorted is received a piece at a time, instead of all at the beginning (see
online algorithm). In this application, we sort each new piece that is
received using any sorting algorithm, and then merge it into our sorted list
so far using the merge operation. However, this approach can be expensive
in time and space if the received pieces are small compared to the sorted list;
a better approach in this case is to store the list in a self-balancing binary
search tree and add elements to it as they are received.
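
A minimal sketch of the sort-then-merge approach follows. The class OnlineMergeSketch and its methods are hypothetical names, and Arrays.sort stands in for "any sorting algorithm".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: online sorting by sorting each piece and merging it in.
final class OnlineMergeSketch {
    private List<Integer> sorted = new ArrayList<>();

    // Sorts the newly received piece, then merges it into the list so far.
    void receive(int[] piece) {
        int[] p = piece.clone();
        Arrays.sort(p);
        List<Integer> merged = new ArrayList<>(sorted.size() + p.length);
        int i = 0, j = 0;
        while (i < sorted.size() && j < p.length)
            merged.add(p[j] < sorted.get(i) ? p[j++] : sorted.get(i++));
        while (i < sorted.size()) merged.add(sorted.get(i++));
        while (j < p.length) merged.add(p[j++]);
        sorted = merged;   // every merge copies the whole list: costly for small pieces
    }

    List<Integer> snapshot() { return sorted; }
}
```

Each call to receive touches every element received so far, which is the cost the paragraph above warns about when pieces are small. A java.util.TreeMap from key to count is one standard-library balanced tree that could serve as the alternative.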


Design Patterns for Sorting 

In 1985, Susan Merritt proposed a new taxonomy for comparison-based sorting algorithms.
At the heart of Merritt's thesis is the principle of divide and conquer. Merritt's thesis is
potentially a very powerful method for studying and understanding sorting. However, the
paper did not offer any concrete implementation of the proposed taxonomy. The following is
our object-oriented formulation and implementation of Merritt's taxonomy.


The following discussion is based on the SIGCSE 2001 paper by Nguyen and Wong,
"Design Patterns for Sorting" [footnote].

D. Nguyen and S. Wong, "Design Patterns for Sorting," SIGCSE Bulletin 33:1, March 2001,
263-267


Merritt's Thesis

In 1985, Susan Merritt proposed that all comparison-based sorting could be viewed as
"Divide and Conquer" algorithms [footnote]. That is, sorting could be thought of as a process
wherein one first "divides" the unsorted pile of whatever needs to be sorted into smaller piles
and then "conquers" them by sorting those smaller piles. Finally, one has to take the
smaller, now sorted piles and recombine them into a single, now-sorted pile.

S. Merritt, "An Inverted Taxonomy of Sorting Algorithms," Comm. of the ACM, Jan. 1985,
Volume 28, Number 1, pp. 96-99


We thus end up with a recursive definition of sorting:
• To sort a pile:
    o Split the pile into smaller piles
    o Sort the smaller piles
    o Join the sorted smaller piles into a single pile


We can see Merritt's recursive notion of sorting as a split-sort-join process in a pictorial
manner by considering the general sorting process as a "black box" process that takes an
unsorted set and returns a sorted set. Merritt's thesis thus contends that this sorting process
can be described as a splitting, followed by a sorting of the smaller pieces, followed by a
joining of the sorted pieces. The smaller sorting process can be similarly described. The
base case of this recursive process is when the set has been reduced to a single element,
upon which the sorting process cannot be broken down any more, as it is a trivial no-op.
[Animation: the Merritt Sorting Thesis]


Sorting can be seen as a recursive process that splits the unsorted items into multiple
unsorted sets, sorts them and then rejoins the now sorted sets. When a set is reduced to a
single element (blank boxes in the animation), sorting is a trivial no-op.


Merritt's thesis is potentially a very powerful method for studying and understanding 
sorting. In addition, Merritt's abstract characterization of sorting exhibits much object- 
oriented (OO) flavor and can be described in terms of OO concepts. 


Capturing the Abstraction 

So, how do we capture the abstraction of sorting as described by Merritt? Fundamentally, we
have to recognize that the above description of sorting contains two distinct parts: the
invariant process of splitting into sub-piles, sorting the sub-piles and joining the sub-piles,
and the variant processes of the actual splitting and joining algorithms used.


Here, we will restrict ourselves to the process of sorting an array of objects in place; that
is, the original array is mutated from unsorted to sorted (as opposed to returning a new array
of sorted values and leaving the original untouched). The Comparator object used to
compare objects will be given to the Sorter's constructor.

[Figure: Abstract Sorter Class. UML diagram of ASorter showing the concrete "Template
Method" sort and the abstract methods split(Object[] A, int lo, int hi) and
join(Object[] A, int lo, int s, int hi), which are relegated to subclasses. Caption: The
invariant sorting process is represented as an abstract class.]


Here, the invariant process is represented by the concrete sort method, which performs the
split-sort-sort-join process as described by Merritt. The variant processes are represented by
the abstract split and join methods, whose exact behaviors are indeterminate at this
time.


The methods above are defined as follows:


final void sort(Object[] A, int lo, int hi) -- sorts the given unsorted
array of objects, A, defined from index lo to index hi, inclusive. This method is
implemented here and marked final to enforce its invariance with respect to the
subclasses. It is this method that implements Merritt's split-sort-join process.


abstract int split(Object[] A, int lo, int hi) -- splits the given
unsorted array of objects, A, defined from index lo to index hi, inclusive, into two adjacent
sub-arrays. The returned index is the index of the first element of the upper sub-array. The
implementation of this abstract method is in the subclasses.


abstract void join(Object[] A, int lo, int s, int hi) -- joins
two sorted adjacent sub-arrays of objects in the array A, where the lower sub-array is from
index lo to index s-1, inclusive, and the upper sub-array is from index s to index hi,
inclusive. The implementation of this abstract method is in the subclasses.


Here's the full code for the abstract ASorter class, the abstract superclass for all concrete
sorters and the implementation of Merritt's template for sorting:
ASorter Class


package sorter;

public abstract class ASorter
{
    protected AOrder aOrder;

    /**
     * The constructor for this class.
     * @param aOrder The abstract ordering strategy to be used by any subclass.
     */
    protected ASorter(AOrder aOrder)
    {
        this.aOrder = aOrder;
    }

    /**
     * Sorts by doing a split-sort-sort-join. Splits the original array into two subarrays,
     * recursively sorts the split subarrays, then re-joins the sorted subarrays together.
     * This is the template method. It calls the abstract methods split and join to do
     * the work. All comparison-based sorting algorithms are concrete subclasses with
     * specific split and join methods.
     * @param A the array A[lo:hi] to be sorted.
     * @param lo the low index of A.
     * @param hi the high index of A.
     */
    public final void sort(Object[] A, int lo, int hi)
    {
        if (lo < hi)
        {
            int s = split(A, lo, hi);
            sort(A, lo, s-1);
            sort(A, s, hi);
            join(A, lo, s, hi);
        }
    }

    /**
     * Splits A[lo:hi] into A[lo:s-1] and A[s:hi] where s is the returned value
     * of this function.
     * @param A the array A[lo:hi] to be sorted.
     * @param lo the low index of A.
     * @param hi the high index of A.
     */
    protected abstract int split(Object[] A, int lo, int hi);

    /**
     * Joins sorted A[lo:s-1] and sorted A[s:hi] into A[lo:hi].
     * @param A A[lo:s-1] and A[s:hi] are sorted.
     * @param lo the low index of A.
     * @param s the index of the first element of the upper sub-array.
     * @param hi the high index of A.
     */
    protected abstract void join(Object[] A, int lo, int s, int hi);

    /**
     * An accessor method for the abstract ordering strategy.
     * @param aOrder
     */
    public void setOrder(AOrder aOrder)
    {
        this.aOrder = aOrder;
    }
}


Note: AOrder is an abstract ordering operator whose concrete implementations define the
binary ordering for the objects being sorted. The examples below use only the
AOrder.lt(Object x, Object y) method, which returns true if x < y. The
sorting framework could easily be modified to use java.util.Comparator instead
with no loss of generality.
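
As a hedged illustration of that remark, the sketch below restates the template with java.util.Comparator in place of AOrder. It is not the authors' code: the class names ComparatorSorter and MergeSorterSketch are hypothetical, and the merge-based subclass is included only so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical Comparator-based variant of the split-sort-sort-join template.
abstract class ComparatorSorter<T> {
    protected final Comparator<? super T> order;

    protected ComparatorSorter(Comparator<? super T> order) { this.order = order; }

    // The invariant template method.
    public final void sort(T[] a, int lo, int hi) {
        if (lo < hi) {
            int s = split(a, lo, hi);
            sort(a, lo, s - 1);
            sort(a, s, hi);
            join(a, lo, s, hi);
        }
    }

    protected abstract int split(T[] a, int lo, int hi);
    protected abstract void join(T[] a, int lo, int s, int hi);
}

// An "easy split, hard join" sorter: split at the middle, join by merging.
final class MergeSorterSketch<T> extends ComparatorSorter<T> {
    MergeSorterSketch(Comparator<? super T> order) { super(order); }

    @Override
    protected int split(T[] a, int lo, int hi) {
        return lo + (hi - lo + 1) / 2;   // first index of the upper sub-array
    }

    @Override
    protected void join(T[] a, int lo, int s, int hi) {
        T[] left = Arrays.copyOfRange(a, lo, s);   // copy of a[lo:s-1]
        int i = 0, j = s, k = lo;
        while (i < left.length && j <= hi)
            a[k++] = (order.compare(a[j], left[i]) < 0) ? a[j++] : left[i++];
        while (i < left.length) a[k++] = left[i++];
        // Any remaining upper elements are already in their final positions.
    }
}
```

A call such as new MergeSorterSketch<Integer>(Comparator.naturalOrder()).sort(a, 0, a.length - 1) sorts an Integer[] a in place.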


Template Design Pattern 
The invariant sorting process as described by Merritt is an example of the Template Method
Design Pattern.

[Figure: Template Method Design Pattern. UML diagram of a white-box framework: an abstract
class with a concrete method +invariant(...) and abstract methods #variant1(...) and
#variant2(...), and subclasses that each implement #variant1(...) and #variant2(...).
Caption: The Template Method Design Pattern describes an invariant concrete process in
terms of variant, abstract methods.]


Here, the invariant process is represented by a concrete method of an abstract superclass.
This concrete method's implementation is in terms of abstract methods of the same class.
These abstract methods represent the variant processes and are implemented in the
subclasses. This type of class organization, where the variant processes are relegated to
subclasses, is also known as a white box framework.


Concrete Sorters 


In order to create a sorter that can actually perform a sorting operation, we need to subclass
the above ASorter class and implement the abstract split and join methods. It should
be noted that, in general, the split and join methods form a matched pair. One can argue
that it is possible to write a universal join method (a merge operation), but it would be
highly inefficient in most cases.


Example: 

Selection Sort 

Traditionally, an in-place selection sort is performed by selecting the smallest (or largest)
value in the array and placing it in the right-most location by either swapping it with the
right-most element or by shifting all the in-between elements to the left. The selection and
swapping/shifting process is then repeated with the sub-array to the left of the newly placed
element. This continues until only one element remains in the array. A selection sort is
commonly used to do something like sort a group of people into ascending height.

Below is an animation of a traditional selection sort algorithm:

Traditional Selection Sort Algorithm


The extrema values are removed from an ever-shrinking
unordered set and placed into the resulting sorted array.
Here, the smallest values are removed from the left and
placed to the right in the array.


In terms of the Merritt sorting paradigm, a selection sort can be broken down into a splitting
process that is the same as the above selection process and a trivial join process. Looking at
the above selection and swap/shift process, we see that it describes the splitting off of
a single element, the smallest, from an array. The process repeats recursively until there is
nothing more to split off. The sorting of a single element is a no-op, so after that the
recursion rolls back out through the joining process. But the joining process is trivial, a
no-op, because the elements are already in their correct positions. The beauty of Merritt's
insight is the realization that by considering a no-op as an operational part of a process, all
the different types of binary comparison-based sorting could be unified under a common
framework.
Below is an animation of a Merritt selection sort algorithm:

Merritt Selection Sort Process


The splitting process splits off one element at a time, the
smallest element, from the left, and it is placed to the right in
the array. The join process is a no-op because the
elements are already in their correct places.


The code to implement a selection sorter is straightforward. One need only implement the
split and join methods, where the split method always returns the lo+1 index because
the smallest value in the (sub-)array has been moved to the index lo position. Because the
bulk of the work is being done in the splitting method, selection sort is classified as a
"hard split, easy join" sorting process. In the Java implementation of the SelectionSorter
class below, the split method splits off the extremum (minimum, here) value from the
sub-array, while the join method is a no-op.

SelectionSorter class


package sorter;

/**
 * A concrete sorter that uses the Selection Sort technique.
 */
public class SelectionSorter extends ASorter
{
    /**
     * The constructor for this class.
     * @param iCompareOp The comparison strategy to use in the sorting.
     */
    public SelectionSorter(AOrder iCompareOp)
    {
        super(iCompareOp);
    }

    /**
     * Splits A[lo:hi] into A[lo:s-1] and A[s:hi] where s is the returned value
     * of this function.
     * This method places the "smallest" value in the lo position and splits it off.
     * @param A the array A[lo:hi] to be sorted.
     * @param lo the low index of A.
     * @param hi the high index of A.
     * @return lo+1 always
     */
    protected int split(Object[] A, int lo, int hi)
    {
        int s = lo;
        int i = lo + 1;
        // Invariant: A[s] <= A[lo:i-1].
        // Scan A to find minimum:
        while (i <= hi)
        {
            if (aOrder.lt(A[i], A[s]))
                s = i;
            i++; // Invariant is maintained.
        } // On loop exit: i = hi + 1; also invariant still holds;
          // this makes A[s] the minimum of A[lo:hi].

        // Swapping A[lo] with A[s]:
        Object temp = A[lo];
        A[lo] = A[s];
        A[s] = temp;
        return lo + 1;
    }

    /**
     * Joins sorted A[lo:s-1] and sorted A[s:hi] into A[lo:hi].
     * This method does nothing. The sub-arrays are already in proper order.
     * @param A A[lo:s-1] and A[s:hi] are sorted.
     * @param lo the low index of A.
     * @param s
     * @param hi the high index of A.
     */
    protected void join(Object[] A, int lo, int s, int hi)
    {
    }
}


What's interesting to note here is what is missing from the above code. A traditional selection
sort algorithm is implemented using a nested double loop, one to find the smallest value
and one to repeatedly process the ever-shrinking unsorted sub-array. Notice that the above
code only has a single loop, which corresponds to the inner loop of a traditional
implementation. The outer loop is embodied in the recursive nature of the sort template
method in the ASorter superclass.
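
For contrast, a traditional nested-loop selection sort might look like the sketch below. It is an illustration only: the class name is hypothetical, it uses java.util.Comparator instead of AOrder, and it places the minimum at the left end, as the SelectionSorter code does.

```java
import java.util.Comparator;

// Hypothetical sketch of the traditional double-loop selection sort.
final class TraditionalSelectionSort {
    static <T> void sort(T[] a, Comparator<? super T> order) {
        // Outer loop: the part played by recursion in the template method.
        for (int lo = 0; lo < a.length - 1; lo++) {
            int s = lo;
            // Inner loop: the scan for the minimum, i.e. the work done in split.
            for (int i = lo + 1; i < a.length; i++)
                if (order.compare(a[i], a[s]) < 0) s = i;
            T tmp = a[lo];
            a[lo] = a[s];
            a[s] = tmp;
        }
    }
}
```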

Notice also that the selection sorter implementation does not include any explicit
connection between the split and join operations, nor does it contain the actual sort
method. These are all contained in the concrete sort method of the superclass. We
describe the SelectionSorter class as a component in a framework (technically a
"white box" framework, as described above). Frameworks display inverted control, where
the components provide services to the framework. The framework itself runs the
algorithms, here the high-level, templated sorting process, and calls upon the services
provided by the components to fill in the necessary processing pieces, e.g. the split and join
procedures.


Example: 

Insertion Sort 

Traditionally, an in-place insertion sort is performed by starting from one end of the array, say
the left end, and performing an in-order insertion of an element into the sub-array to its left.
The next element to the right is then chosen and the insertion process repeated. At each
insertion, the sorted sub-array on the left grows until it encompasses the entire array. An
insertion sort is a very typical way in which people will order a set of playing cards in their
hand.
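
The traditional process just described can be sketched as an explicit loop. The class name is hypothetical and java.util.Comparator is assumed as the ordering.

```java
import java.util.Comparator;

// Hypothetical sketch of the traditional iterative insertion sort.
final class TraditionalInsertionSort {
    static <T> void sort(T[] a, Comparator<? super T> order) {
        // a[0:hi-1] is the sorted sub-array on the left; a[hi] is inserted into it.
        for (int hi = 1; hi < a.length; hi++) {
            T key = a[hi];
            int j = hi;
            // Shift elements greater than key one place to the right.
            while (j > 0 && order.compare(key, a[j - 1]) < 0) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = key;
        }
    }
}
```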

Below is an animation of a traditional insertion sort algorithm:

Traditional Insertion Sort Algorithm


Starting from the left, elements from the immediate right
are inserted into a growing sub-array to the left.


In the Merritt paradigm, the insertion sort first splits the array or sub-array into two pieces
simply by separating the right-most element. Recursively, the splitting process proceeds
from the right to the left until a single element is left in the sub-array. Sorting a one-element
array is a no-op, so the recursion then unwinds with the join process. The join process
combines each single split-off element with its sorted sub-array partner to its left by
performing an in-order insertion. This proceeds as the recursion unwinds until the entire
array is fully sorted. In contrast to the selection sort, the bulk of the work is being done in
the join method, hence classifying insertion sort as an "easy split, hard join" sorting
process.

Below is an animation of a Merritt insertion sort algorithm:

Merritt Insertion Sort Process


The right-most elements are first split off one by one,
starting at the right and moving left. The split-off
elements are then joined by performing an in-order
insertion to the left, starting at the left.


In the Java implementation of the insertion sorter below, the split method simply splits
off the right-most element of the sub-array. The join method performs an in-order
insertion of the single split-off element into the larger sub-array to its left.
InsertionSorter class


package sorter;

/**
 * A concrete sorter that uses the Insertion Sort technique.
 */
public class InsertionSorter extends ASorter
{
    /**
     * The constructor for this class.
     * @param iCompareOp The comparison strategy to use in the sorting.
     */
    public InsertionSorter(AOrder iCompareOp)
    {
        super(iCompareOp);
    }

    /**
     * Splits A[lo:hi] into A[lo:s-1] and A[s:hi] where s is the returned value
     * of this function.
     * This simply splits off the element at index hi.
     * @param A the array A[lo:hi] to be sorted.
     * @param lo the low index of A.
     * @param hi the high index of A.
     * @return hi always.
     */
    protected int split(Object[] A, int lo, int hi)
    {
        return (hi);
    }

    /**
     * Joins sorted A[lo:s-1] and sorted A[s:hi] into A[lo:hi]. (s = hi)
     * The method performs an in-order insertion of A[hi] into A[lo:hi-1].
     * @param A A[lo:s-1] and A[s:hi] are sorted.
     * @param lo the low index of A.
     * @param s
     * @param hi the high index of A.
     */
    protected void join(Object[] A, int lo, int s, int hi)
    {
        int j = hi; // remember s == hi.
        Object key = A[hi];
        // Invariant: A[lo:j-1] and A[j+1:hi] are sorted and key < all
        // elements of A[j+1:hi].
        // Shifts elements of A[lo:j-1] that are greater than key to the
        // "right" to make room for key.
        while (lo < j && aOrder.lt(key, A[j-1]))
        {
            A[j] = A[j-1];
            A[j-1] = key;
            j = j - 1; // invariant is maintained.
        } // On loop exit: j = lo or A[j-1] <= key. Also invariant is still
          // true.
        // A[j] = key;
    }
}


Exercise:


Problem:


The authors were once challenged that the Merritt template-based sorting paradigm
could not be used to describe the Shaker Sort process (a bidirectional Bubble or
Selection sort). See, for instance, http://en.wikipedia.org/wiki/Cocktail_sort. However, it
can be done in a very straightforward manner. There are a number of viable solutions.
Hint: think about the State Design Pattern.


Solution: 


The solution is left to the student but is available from the authors if proof of
non-student status is provided.


For more examples, please download the demo code. Please note that the ShakerSort
code is disabled due to its use as a student exercise.


Sorting an Array 
An introduction and example of sorting an array within the C++ 
programming language. 


Overview 


Sorting is the process through which data are arranged according to their
values. There are several sorting algorithms or methods that can be used to
sort data. Some include:


1. Bubble 
2. Selection 
3. Insertion 


We will not be covering the selection or insertion sort methods in this
module.


"The bubble sort is an easy way to arrange data in ascending or descending 
order. If an array is sorted in ascending order it means the values in the
array are stored from lowest to highest. If values are sorted in descending
order they are stored from highest to lowest. Bubble sort works by
comparing each element with its neighbor and swapping them if they are
not in the desired order." [footnote]

Tony Gaddis, Judy Walters and Godfrey Muganda, Starting Out with C++
Early Objects Sixth Edition (United States of America: Pearson - Addison
Wesley, 2008) 569.


There are several different methods of bubble sorting, and some methods are
more efficient than others. Most use a pair of nested loops or iteration
control structures. One method sets a flag that indicates that the array is
sorted, then does a pass and, if any elements are exchanged (switched),
sets the flag to indicate that the array is not sorted. It is executed until it
makes a pass and nothing is exchanged.


This bubble sort sets a flag that indicates that the array is sorted (that is, it does not need
more sorting), then does a pass and, if any elements are exchanged (switched), sets the
flag to indicate that the array is not sorted (that is, it needs more sorting). The outer do
while loop is executed until the inner for loop makes a pass and nothing is exchanged.


Here is some color-highlighted C++ code from Demo_Sort_Array_Function.cpp


do
  {
  moresortneeded = false;
  for (int i = 0; i < array_size - 1; i++)
    {
    if (things[i] > things[i+1])
      {
      temp = things[i];
      things[i] = things[i+1];
      things[i+1] = temp;
      moresortneeded = true;
      }
    }
  }
while (moresortneeded);
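
The fragment above is an excerpt from the demo program. As a complete, runnable illustration of the same flag-controlled loop, here is a sketch in Java. The class name, the method name bubbleSort and the pass counter are additions for this illustration and are not part of the demo file.

```java
import java.util.Arrays;

// Hypothetical sketch: flag-controlled bubble sort that also counts passes.
final class BubbleSortSketch {
    // Sorts things in ascending order; returns the number of passes made,
    // including the final pass in which nothing is exchanged.
    static int bubbleSort(int[] things) {
        boolean moreSortNeeded;
        int passes = 0;
        do {
            moreSortNeeded = false;
            for (int i = 0; i < things.length - 1; i++) {
                if (things[i] > things[i + 1]) {
                    int temp = things[i];
                    things[i] = things[i + 1];
                    things[i + 1] = temp;
                    moreSortNeeded = true;
                }
            }
            passes++;
        } while (moreSortNeeded);
        return passes;
    }

    public static void main(String[] args) {
        int[] things = {3, 1, 2};
        int passes = bubbleSort(things);
        // prints "[1, 2, 3] after 2 passes"
        System.out.println(Arrays.toString(things) + " after " + passes + " passes");
    }
}
```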


The bubble sort gets its name from the lighter bubbles that move or "bubble
up" to the top of a glass of soda pop. We move the smaller elements of the
array to the top as the larger elements move to the bottom of the array. This
can be viewed from a different perspective. Using an Italian salad dressing
with oil, water and herbs, once shaken you can either:


1. envision the lighter oil rising to the top; OR
2. envision the heavier water and herbs sinking to the bottom


Either way is correct, and this version of the code simply demonstrates the
sinking to the bottom of the heavier or larger elements of the array.


Bubble sorting is demonstrated in the demo file provided, thus you need to
study this material in conjunction with the demo program.


Demonstration Program in C++ 


Creating a Folder or Sub-Folder for Source Code Files 


Depending on your compiler/IDE, you should decide where to download
and store source code files for processing. Prudence dictates that you create
these folders as needed prior to downloading source code files. A suggested
sub-folder for the Bloodshed Dev-C++ 5 compiler/IDE might be named:


• Demo_Programs


If you have not done so, please create the folder(s) and/or sub-folder(s) as
appropriate.


Download the Demo Program 


Download and store the following file(s) to your storage device in the
appropriate folder(s). Following the methods of your compiler/IDE,
compile and run the program(s). Study the source code file(s) in
conjunction with other learning materials. You may need to right click on
the link and select "Save Target As" in order to download the file.


Download from Connexions: Demo_Sort_Array_Function.cpp


Download from Connexions: Demo_Farm_Acres_Input.txt


Definitions 


Sorting 
Arranging data according to their values. 


bubble sort 
A method of swapping array members until they are in the desired
sequence.


Graphical Convolution Algorithm
$$c(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t-\tau)\, d\tau$$


Step One 


Plot $f(\tau)$ and $g(\tau)$ as functions of $\tau$.


[Figure: plots of $f(\tau)$ and $g(\tau)$]


Step Two 


Plot $g(t-\tau)$ by reflecting $g(\tau)$ over the 'y-axis' (run time backwards) and
then shifting right by $t$.


[Figure: plots of $g(-\tau)$ and $g(t-\tau)$ for $t > 0$]


Step Three 


For one value of $t$, multiply $f(\tau)\, g(t-\tau)$ and compute the area underneath the
curve to get $c(t)$.


[Figure: the product $f(\tau)\, g(t-\tau)$; the area underneath the curve is $c(t)$]


Step Four 


Repeat for all $t$ to get $c(t)$ for all $t$. Usually we will just have to consider
several ranges of $t$.


[Figure: a range with no overlap, where the product $= 0\ \forall\, t < -1$]


Step Five 


Reality check: Does your answer actually make sense?


Remark 


Since
$$c(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t-\tau)\, d\tau = \int_{-\infty}^{\infty} g(\tau)\, f(t-\tau)\, d\tau,$$
you can flip and shift either f or g. It is easier to flip and shift the 'simpler'
of the two.




Note: Everyone is overwhelmed by convolution at first! Just practise and it
will become second nature. Do examples 2.6 to 2.8 in Lathi.


Example:
Recall


[Figure: the system under study]

Now compute the output $y(t)$ for a step input $f(t) = u(t)$.

Solution 


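System is LTI with impulse response $h(t)$, so use the
convolution integral
$$y(t) = \int_{-\infty}^{\infty} f(\tau)\, h(t-\tau)\, d\tau$$
Since $f(\tau)$ is simpler, we rewrite it as
$$y(t) = \int_{-\infty}^{\infty} h(\tau)\, f(t-\tau)\, d\tau$$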


Step 1


Plot things


[Figure: plots of $h(\tau)$ and $f(\tau) = u(\tau)$]


Step 2 


Do the flip and shift. 


[Figure: plots of $f(\tau)$, $f(-\tau)$ and $f(t-\tau)$]


Step 3 & 4


Multiply and integrate. 


Case 1


For $t < 0$


[Figure: $h(\tau)$ and $f(t-\tau)$ for $t < 0$]


From the fact stated in the caption,
$$y(t) = \int_{-\infty}^{\infty} h(\tau)\, f(t-\tau)\, d\tau = 0$$


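Case 2


For $t \ge 0$


[Figure: $h(\tau)$ and $f(t-\tau)$ for $t \ge 0$]


Equation:
$$y(t) = \int_{0}^{t} h(\tau)\, d\tau$$
Answer
$$y(t) = \begin{cases} 0 & \text{if } t < 0 \\ \int_{0}^{t} h(\tau)\, d\tau & \text{if } t \ge 0 \end{cases}$$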


Step 5 


Do a reality check: As $t$ tends to $\infty$, what happens? As $t$ tends
to $-\infty$, what happens?


Example:
The input is $f(t) = e^{-t}u(t)$ and the impulse response is $h(t) = e^{-2t}u(t)$.
Compute $y(t)$.


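Solution


We are given input and impulse response. So ride the
convolution convoy!
$$y(t) = \int_{-\infty}^{\infty} f(\tau)\, h(t-\tau)\, d\tau$$
Both the functions are equally simple, so we flip and shift
$h(\tau)$.


[Figure: plots of $f(\tau)$ and $h(\tau)$]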


Case 1


Again $y(t) = 0$ for all $t < 0$


[Figure: $f(\tau)$ and $h(t-\tau)$ for $t < 0$; product and integral $= 0\ \forall\, t < 0$]


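Case 2


For $t \ge 0$


[Figure: $f(\tau)$ and $h(t-\tau)$ for $t \ge 0$]


Equation:
$$\begin{aligned}
y(t) &= \int_{0}^{t} f(\tau)\, h(t-\tau)\, d\tau \\
&= \int_{0}^{t} e^{-\tau}\, e^{-2(t-\tau)}\, d\tau \\
&= e^{-2t} \int_{0}^{t} e^{\tau}\, d\tau \\
&= e^{-2t} \left[ e^{\tau} \right]_{0}^{t} \\
&= e^{-2t} \left( e^{t} - 1 \right) = e^{-t} - e^{-2t}
\end{aligned}$$


Combine Case 1 and 2
Equation:
$$y(t) = \begin{cases} 0 & \text{if } t < 0 \\ e^{-2t}\left(e^{t} - 1\right) & \text{if } t \ge 0 \end{cases}$$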
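
As a numerical reality check, the convolution integral can be approximated by a Riemann sum and compared with a closed form. The sketch below assumes the example's signals are $f(t) = e^{-t}u(t)$ and $h(t) = e^{-2t}u(t)$, for which the convolution is $e^{-t} - e^{-2t}$ when $t \ge 0$. The class name, the integration range and the step count are arbitrary choices for illustration.

```java
// Hypothetical sketch: midpoint Riemann sum for y(t) = integral of f(tau) h(t - tau) d tau.
final class ConvolutionCheck {
    // Assumed input: f(t) = e^{-t} u(t).
    static double f(double t) { return t >= 0 ? Math.exp(-t) : 0.0; }

    // Assumed impulse response: h(t) = e^{-2t} u(t).
    static double h(double t) { return t >= 0 ? Math.exp(-2 * t) : 0.0; }

    // Approximates the convolution at time t, integrating tau over [lo, hi].
    static double convolve(double t, double lo, double hi, int steps) {
        double dTau = (hi - lo) / steps;
        double sum = 0.0;
        for (int k = 0; k < steps; k++) {
            double tau = lo + (k + 0.5) * dTau;   // midpoint of the k-th slice
            sum += f(tau) * h(t - tau) * dTau;
        }
        return sum;
    }

    public static void main(String[] args) {
        double t = 1.0;
        System.out.println(convolve(t, -5.0, 5.0, 100_000));   // numerical estimate
        System.out.println(Math.exp(-t) - Math.exp(-2 * t));    // closed form
    }
}
```

For negative t the two signals never overlap, so the sum is exactly zero, in line with Case 1.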


Algorithm Overview 




The first step in detecting a signal is to input it into the system. Since we are
using an audio signal, a microphone is the obvious choice. However, the
desired control signal is not the only sound in the room. There is also the
music that is being played through the speakers, as well as outside noise.
Unfortunately, there is not much that can be done about the random noise
that is present in the room. However, there are tools available that allow us
to minimize the interference caused by the music. Specifically, we can use
an adaptive filter to mimic the room's effect on the output of the sound
card, providing us with an estimate of the music's contribution to the signal
received by the microphone. We can subtract this estimate from the
microphone's signal in order to obtain, hopefully, only the whistle.
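
The text does not say which adaptive algorithm is used. As one common possibility, the sketch below uses a normalized LMS (least mean squares) filter; that choice, the class name, the tap count and the step size mu are all assumptions made for illustration.

```java
// Hypothetical sketch: cancelling the music with a normalized LMS adaptive filter.
final class AdaptiveCancelSketch {
    // music: the sound-card output (reference); mic: the microphone signal.
    // Returns the residual e[n] = mic[n] - estimate[n], ideally the whistle alone.
    static double[] cancel(double[] music, double[] mic, int taps, double mu) {
        double[] w = new double[taps];          // adaptive estimate of the room response
        double[] e = new double[mic.length];
        for (int n = 0; n < mic.length; n++) {
            double estimate = 0.0;
            double power = 1e-9;                // small constant avoids division by zero
            for (int k = 0; k < taps && k <= n; k++) {
                estimate += w[k] * music[n - k];
                power += music[n - k] * music[n - k];
            }
            e[n] = mic[n] - estimate;           // subtract the estimated music
            for (int k = 0; k < taps && k <= n; k++)
                w[k] += mu * e[n] * music[n - k] / power;   // normalized LMS update
        }
        return e;
    }
}
```

When the microphone hears only the music passed through a short, fixed room response, the residual shrinks toward zero as the weights converge. A whistle, being uncorrelated with the music, would remain in the residual.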


Figure 1 shows the block diagram of our system. All of these components
fall into four main categories:


1. Signal acquisition 

2. Whistle isolation 

3. Whistle frequency analysis 
4. iTunes interface 


The acquisition phase is the top portion of the diagram, the whistle isolation
system is represented by the h box and the band pass filter, the rest of the
diagram (except for the Java controller) comprises the whistle analysis
phase, and the Java controller is the iTunes interface.

[Figure 1: Our System's Block Diagram. Recoverable block labels: Derivative,
|sum| > threshold, Java Controller.]


In our system, the sound is output through the speaker,
and the microphone receives the music and whistles
while the sound card receives the audio without the
room affecting it. We then remove the music and
process the whistle.


After isolating the whistle, we apply a band pass filter whose pass band
corresponds to common whistle frequencies in order to remove extraneous
noise outside of these whistle frequencies.


We then take the Short Time Fourier Transform in order to see how the
frequency components of the whistle change over time. If the frequency of
the whistle is increasing, iTunes should advance to the next track, and a
decreasing frequency will skip to the previous track. To accomplish this, we
examine the frequency with maximum power (the argmax in the figure
below) and accumulate several readings of this frequency. In order to see if
this function is increasing or decreasing, we take the derivative and examine
its average value. If the average value is positive, the function must have
been increasing and the whistle must have gone from low to high
frequencies.
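
The decision step just described can be sketched as follows. The readings are assumed to be the peak (argmax) frequencies in Hz from successive STFT frames. The class name, the Command enum and the threshold parameter are hypothetical, the threshold being loosely modelled on the "|sum| > threshold" block in the diagram.

```java
// Hypothetical sketch: classify a whistle by the average slope of its peak frequency.
final class WhistleDirectionSketch {
    enum Command { NEXT_TRACK, PREVIOUS_TRACK, NONE }

    // peakHz: accumulated peak-frequency readings, one per STFT frame.
    // thresholdHz: minimum average change per frame that counts as a whistle.
    static Command classify(double[] peakHz, double thresholdHz) {
        if (peakHz.length < 2) return Command.NONE;
        double sum = 0.0;
        for (int i = 1; i < peakHz.length; i++)
            sum += peakHz[i] - peakHz[i - 1];        // discrete derivative
        double average = sum / (peakHz.length - 1);
        if (average > thresholdHz) return Command.NEXT_TRACK;        // rising whistle
        if (average < -thresholdHz) return Command.PREVIOUS_TRACK;   // falling whistle
        return Command.NONE;
    }
}
```

Note that the average of the differences depends only on the first and last readings, so a practical system might prefer a more robust slope estimate over noisy frames.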


